Abstract

The $\epsilon$-approximate degree of a Boolean function $f : \{-1,1\}^n \to \{-1,1\}$ is the minimum degree of
a real polynomial that approximates $f$ to within $\epsilon$ in the $\ell_\infty$ norm. We prove several lower bounds on
this important complexity measure by explicitly constructing solutions to the dual of an appropriate
linear program. Our first result resolves the $\epsilon$-approximate degree of the two-level AND-OR tree for any
constant $\epsilon > 0$. We show that this quantity is $\Theta(\sqrt{n})$, closing a line of incrementally larger lower bounds
[3, 12, 25, 36, 38]. The same lower bound was recently obtained independently by Sherstov using related
techniques [31]. Our second result gives an explicit dual polynomial that witnesses a tight lower bound for
the approximate degree of any symmetric Boolean function, addressing a question of Špalek [40]. Our final
contribution is to reprove several Markov-type inequalities from approximation theory by constructing
explicit dual solutions to natural linear programs. These inequalities underlie the proofs of many of the
best-known approximate degree lower bounds, and have important uses throughout theoretical computer
science.



1 Introduction 

Approximate degree is an important measure of the complexity of a Boolean function. It captures whether
a function can be approximated by a low-degree polynomial with real coefficients in the $\ell_\infty$ norm, and
it has many applications in theoretical computer science. The study of approximate degree has enabled
progress in circuit complexity [7, 5, 23, 35], quantum computing (where it has been used to prove lower
bounds on quantum query complexity, e.g. plISlITT]), communication complexity [4, 10, 20, 33, 37, 39, 40], and
computational learning theory (where approximate degree upper bounds underlie the best known algorithms
for PAC learning DNF formulas and agnostically learning disjunctions) P^[T5].

In this paper, we seek to advance our understanding of this fundamental complexity measure. We focus
on proving approximate degree lower bounds by specifying explicit dual polynomials, which are dual solutions
to a certain linear program capturing the approximate degree of any function. These polynomials act as
certificates of the high approximate degree of a function. Their construction is of interest because these dual
objects have been used recently to resolve several long-standing open problems in communication complexity
(e.g. [4, 10, 20, 33, 39, 40]). See the survey of Sherstov [32] for an excellent overview of this body of literature.

Our Contributions. Our first result resolves the approximate degree of the function
$f(x) = \wedge_{i=1}^{N} \vee_{j=1}^{N} x_{ij}$, showing this quantity is $\Theta(N)$. Known as the two-level AND-OR tree, $f$ is
the simplest function whose approximate degree was not previously characterized. A series of works spanning
nearly two decades proved incrementally larger lower bounds on the approximate degree of this function, and
this question was recently re-posed by Aaronson in a tutorial at FOCS 2008 [1]. Our proof not only yields a
tight lower bound, but it specifies an explicit dual polynomial for the high approximate degree of $f$,
answering a question of Špalek [40] in the affirmative.

Our second result gives an explicit dual polynomial witnessing the high approximate degree of any
symmetric Boolean function, recovering a well-known result of Paturi [26]. Our solution builds on work of
Špalek [40], who gave an explicit dual polynomial for the OR function, and addresses an open question from
that work.

Our final contribution is to reprove several classical Markov-type inequalities from approximation theory.
These inequalities bound the derivative of a polynomial in terms of its degree. Combined with the well-known
symmetrization technique (see e.g. [1, 23]), Markov-type inequalities have traditionally been the primary tool
used to prove approximate degree lower bounds on Boolean functions (e.g. [2, 3, 25, 38]). Our proofs of these
inequalities specify explicit dual solutions to a natural linear program (that differs from the one used to
prove our first two results). While these inequalities have been known for over a century [9, 21, 22], to the
best of our knowledge our proof technique is novel, and we believe it sheds new light on these results.



2 Preliminaries 

We work with Boolean functions $f : \{-1,1\}^n \to \{-1,1\}$ under the standard convention that $1$ corresponds
to logical false, and $-1$ corresponds to logical true. We let $\|f\|_\infty = \max_{x \in \{-1,1\}^n} |f(x)|$ denote the
$\ell_\infty$ norm of $f$. The $\epsilon$-approximate degree of a function $f : \{-1,1\}^n \to \{-1,1\}$, denoted
$\widetilde{\deg}_\epsilon(f)$, is the minimum (total) degree of any real polynomial $p$ such that $\|p - f\|_\infty \le \epsilon$,
i.e. $|p(x) - f(x)| \le \epsilon$ for all $x \in \{-1,1\}^n$. We use $\widetilde{\deg}(f)$ to denote $\widetilde{\deg}_{1/3}(f)$, and use
this to refer to the approximate degree of a function without qualification. The choice of $1/3$ is arbitrary,
as $\widetilde{\deg}(f)$ is related to $\widetilde{\deg}_\epsilon(f)$ by a constant factor for any constant $\epsilon \in (0,1)$. We let
$\mathrm{OR}_n$ and $\mathrm{AND}_n$ denote the OR function and AND function on $n$ variables respectively, and we let
$1_n \in \{-1,1\}^n$ denote the $n$-dimensional all-ones vector. Define $\mathrm{sgn}(x) = -1$ if $x < 0$ and $1$ otherwise.

In addition to approximate degree, block sensitivity is also an important measure of the complexity of a
Boolean function. We introduce this measure because functions with low block sensitivity are an "easy case"
in the analysis of Theorem 2 below. The block sensitivity $\mathrm{bs}_x(f)$ of a Boolean function
$f : \{-1,1\}^n \to \{-1,1\}$ at the point $x$ is the maximum number of pairwise disjoint subsets
$S_1, S_2, S_3, \dots \subseteq \{1, 2, \dots, n\}$ such that $f(x) \ne f(x^{S_1}) = f(x^{S_2}) = f(x^{S_3}) = \dots$ Here, $x^S$ denotes
the vector obtained from $x$ by negating each entry whose index is in $S$. The block sensitivity $\mathrm{bs}(f)$ of $f$
is the maximum of $\mathrm{bs}_x(f)$ over all $x \in \{-1,1\}^n$.
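
As a small illustration of this definition, the following brute-force sketch computes $\mathrm{bs}_x(f)$ for tiny $n$ by enumerating every sensitive block and searching for the largest pairwise disjoint family. The function names (`and_fn`, `flip`, `block_sensitivity_at`, `block_sensitivity`) are hypothetical and chosen only for this example; the search is exponential in $n$ and is not meant as an efficient algorithm.

```python
from itertools import product

def and_fn(x):
    # Convention from the text: -1 is logical true, 1 is logical false.
    return -1 if all(b == -1 for b in x) else 1

def flip(x, mask):
    # Negate every entry of x whose index is set in the bitmask.
    return tuple(-b if (mask >> i) & 1 else b for i, b in enumerate(x))

def block_sensitivity_at(f, x):
    n = len(x)
    # Sensitive blocks: nonempty subsets S with f(x^S) != f(x).
    sensitive = [m for m in range(1, 1 << n) if f(flip(x, m)) != f(x)]

    def best(used, start):
        # Size of the largest pairwise disjoint family avoiding `used`.
        result = 0
        for idx in range(start, len(sensitive)):
            m = sensitive[idx]
            if m & used == 0:
                result = max(result, 1 + best(used | m, idx + 1))
        return result

    return best(0, 0)

def block_sensitivity(f, n):
    # Maximum of bs_x(f) over the whole hypercube.
    return max(block_sensitivity_at(f, x) for x in product((-1, 1), repeat=n))
```

On three variables, AND has block sensitivity 3 at the all-true input and 1 at every other input, which is the asymmetry the AND-OR analysis below relies on.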



2.1 A Dual Characterization of Approximate Degree 



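For a subset $S \subseteq \{1, \dots, n\}$ and $x \in \{-1,1\}^n$, let $\chi_S(x) = \prod_{i \in S} x_i$. Given a Boolean function $f$, let
$p(x) = \sum_{|S| \le d} c_S \chi_S(x)$ be a polynomial of degree $d$ that minimizes $\|p - f\|_\infty$, where the coefficients $c_S$ are
real numbers. Then $p$ is an optimum of the following linear program.

\[
\begin{array}{lll}
\min & \epsilon & \\
\text{such that} & \left| f(x) - \sum_{|S| \le d} c_S \chi_S(x) \right| \le \epsilon & \text{for each } x \in \{-1,1\}^n \\
 & c_S \in \mathbb{R} & \text{for each } |S| \le d \\
 & \epsilon \ge 0 &
\end{array}
\]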





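The dual LP is as follows.

\[
\begin{array}{lll}
\max & \sum_{x \in \{-1,1\}^n} \phi(x) f(x) & \\
\text{such that} & \sum_{x \in \{-1,1\}^n} |\phi(x)| = 1 & \\
 & \sum_{x \in \{-1,1\}^n} \phi(x) \chi_S(x) = 0 & \text{for each } |S| \le d \\
 & \phi(x) \in \mathbb{R} & \text{for each } x \in \{-1,1\}^n
\end{array}
\]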



Strong LP-duality yields the following well-known dual characterization of approximate degree (cf. [33]).

Theorem 1. Let $f : \{-1,1\}^n \to \{-1,1\}$ be a Boolean function. Then $\widetilde{\deg}_\epsilon(f) > d$ if and only if there is a
polynomial $\phi : \{-1,1\}^n \to \mathbb{R}$ such that

\[ \sum_{x \in \{-1,1\}^n} f(x) \phi(x) > \epsilon, \tag{1} \]

\[ \sum_{x \in \{-1,1\}^n} |\phi(x)| = 1, \tag{2} \]

and

\[ \sum_{x \in \{-1,1\}^n} \phi(x) \chi_S(x) = 0 \quad \text{for each } |S| \le d. \tag{3} \]

If $\phi$ satisfies Eq. (3), we say $\phi$ has pure high degree $d$. We refer to any feasible solution $\phi$ to the dual LP
as a dual polynomial for $f$.
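
To make the three conditions of Theorem 1 concrete, here is a minimal brute-force sketch that evaluates them for a candidate dual polynomial on a small hypercube. The helper names (`chi`, `dual_report`) are hypothetical, and the example uses the parity function with $\phi = \chi_{\{1,\dots,n\}} / 2^n$ purely as an easy sanity check, not as one of the constructions of this paper.

```python
from fractions import Fraction
from itertools import combinations, product

def chi(S, x):
    # Character chi_S(x) = product of x_i over i in S.
    value = 1
    for i in S:
        value *= x[i]
    return value

def dual_report(f, phi, n, d):
    """Return (correlation, l1_norm, orthogonal) for a candidate dual polynomial."""
    cube = list(product((-1, 1), repeat=n))
    correlation = sum(f(x) * phi(x) for x in cube)          # left side of Eq. (1)
    l1_norm = sum(abs(phi(x)) for x in cube)                # left side of Eq. (2)
    orthogonal = all(                                       # Eq. (3) for all |S| <= d
        sum(phi(x) * chi(S, x) for x in cube) == 0
        for size in range(d + 1)
        for S in combinations(range(n), size)
    )
    return correlation, l1_norm, orthogonal

n = 4
parity = lambda x: chi(range(n), x)
phi = lambda x: Fraction(parity(x), 2 ** n)
report = dual_report(parity, phi, n, n - 1)
```

Here `report` shows correlation 1, unit $\ell_1$ norm, and pure high degree $n - 1$, so Theorem 1 certifies that parity on 4 bits has $\epsilon$-approximate degree greater than 3 for every $\epsilon < 1$.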



3 A Dual Polynomial for the AND-OR Tree

Define $\mathrm{AND\text{-}OR}_N^M : \{-1,1\}^{MN} \to \{-1,1\}$ by $f(x) = \wedge_{i=1}^{M} \vee_{j=1}^{N} x_{ij}$. $\mathrm{AND\text{-}OR}_N^N$ is known as the two-level
AND-OR tree, and its approximate degree has resisted characterization for close to two decades. Nisan and
Szegedy proved an $\Omega(N^{1/2})$ lower bound on $\widetilde{\deg}(\mathrm{AND\text{-}OR}_N^N)$ in [25]. This was subsequently improved to
$\Omega(\sqrt{N \log N})$ by Shi [38], and improved further to $\Omega(N^{2/3})$ by Ambainis [3]. Most recently, Sherstov proved
an $\Omega(N^{3/4})$ lower bound in [36], which was the best lower bound prior to our work. The best upper bound
is $O(N)$ due to Høyer, Mosca, and de Wolf [12], which matches our new lower bound.

By refining Sherstov's analysis in [36], we will show that $\widetilde{\deg}(\mathrm{AND\text{-}OR}_N^M) = \Omega(\sqrt{MN})$, which matches
an upper bound implied by a result of Sherstov [34]. In particular, this implies that the approximate degree
of the two-level AND-OR tree is $\Theta(N)$.

Theorem 2. $\widetilde{\deg}(\mathrm{AND\text{-}OR}_N^M) = \Theta(\sqrt{MN})$.

Independent work by Sherstov. Independently of our work, Sherstov [31] has discovered the same
$\Omega(\sqrt{MN})$ lower bound on $\widetilde{\deg}(\mathrm{AND\text{-}OR}_N^M)$. Both his proof and ours exploit the fact that the OR function
has a dual polynomial with one-sided error. Our proof proceeds by constructing an explicit dual polynomial
for $\mathrm{AND\text{-}OR}_N^M$, by combining a dual polynomial for $\mathrm{OR}_N$ with a dual polynomial for $\mathrm{AND}_M$. In contrast,
Sherstov mixes the primal and dual views: his proof combines a dual polynomial for $\mathrm{OR}_N$ with an
approximating polynomial $p$ for $\mathrm{AND\text{-}OR}_N^M$ to construct an approximating polynomial $q$ for $\mathrm{AND}_M$. The proof in
[31] shows that $q$ has much lower degree than $p$, so the desired lower bound on the degree of $p$ follows from
known lower bounds on the degree of $q$.

The proof of [31] is short (barely more than a page), while our proof has the benefit of yielding an explicit
dual polynomial witnessing the lower bound.

3.1 Proof Outline 

Our proof is a refinement of a result of Sherstov [36], which roughly showed that approximate degree increases
multiplicatively under function composition. Specifically, Sherstov showed the following.

Proposition 3 ([36, Theorem 3.3]). Let $F : \{-1,1\}^M \to \{-1,1\}$ and $f : \{-1,1\}^N \to \{-1,1\}$ be given
functions. Then for all $\epsilon, \delta > 0$,

\[ \widetilde{\deg}_{\epsilon - 4\delta\,\mathrm{bs}(F)}(F(f, \dots, f)) \ge \widetilde{\deg}_\epsilon(F) \cdot \widetilde{\deg}_{1-\delta}(f). \]

Sherstov's proof of Proposition 3 proceeds by taking a dual witness $\Psi$ to the high $\epsilon$-approximate degree
of $F$, and combining it with a dual witness $\psi$ to the high $(1 - \delta)$-approximate degree of $f$ to obtain a dual
witness $\zeta$ for the high $(\epsilon - 4\delta\,\mathrm{bs}(F))$-approximate degree of $F(f, \dots, f)$. His proof proceeds in two steps: he
first shows that $\zeta$ has pure high degree at least $\widetilde{\deg}_\epsilon(F) \cdot \widetilde{\deg}_{1-\delta}(f)$, and then he lower bounds the correlation
of $\zeta$ with $F(f, \dots, f)$. The latter step of this analysis yields a lower bound on the correlation of $\zeta$ with
$F(f, \dots, f)$ that deteriorates rapidly as the block sensitivity $\mathrm{bs}(F)$ grows.

Proposition 3 itself does not yield a tight lower bound for $\widetilde{\deg}(\mathrm{AND\text{-}OR}_N^M)$, because the function $\mathrm{AND}_M$
has maximum block sensitivity $\mathrm{bs}(\mathrm{AND}_M) = M$. We address this by refining the second step of Sherstov's
analysis in the case where $F = \mathrm{AND}_M$ and $f = \mathrm{OR}_N$. We leverage two facts. First, although the block
sensitivity of $\mathrm{AND}_M$ is high, it is only high at one input, namely the all-true input. At all other inputs,
$\mathrm{AND}_M$ has low block sensitivity and the analysis of Proposition 3 is tight. Second, we use the fact that any
dual witness to the high approximate degree of $\mathrm{OR}_N$ has one-sided error. Namely, if $\psi(x) < 0$ for such a
dual witness $\psi$, then we know that $\psi(x)$ agrees in sign with $\mathrm{OR}_N(x)$. This property allows us to handle the
all-true input to $\mathrm{AND}_M$ separately: we use it to show that despite the high block sensitivity of $\mathrm{AND}_M$ at the
all-true input $y$, this input nonetheless contributes positively to the correlation between $\zeta$ and $F(f, \dots, f)$.
The details of our construction follow.



3.2 Proof of Thm. 2

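Nisan and Szegedy proved the now well-known result that for any constant $0 < \epsilon < 1$,
$\widetilde{\deg}_\epsilon(\mathrm{AND}_n) = \widetilde{\deg}_\epsilon(\mathrm{OR}_n) = \Theta(\sqrt{n})$. Let $\Psi : \{-1,1\}^M \to \mathbb{R}$ be a dual witness for the
$(1/3)$-approximate degree of $\mathrm{AND}_M$ whose existence is guaranteed by Theorem 1. There is some $\epsilon > 1/3$ and
$d = \Omega(\sqrt{M})$ such that $\Psi$ satisfies:

\[ \sum_{x \in \{-1,1\}^M} \Psi(x)\,\mathrm{AND}_M(x) = \epsilon, \tag{4} \]

\[ \sum_{x \in \{-1,1\}^M} |\Psi(x)| = 1, \tag{5} \]

\[ \sum_{x \in \{-1,1\}^M} \Psi(x) \chi_S(x) = 0 \quad \text{for each } |S| \le d. \tag{6} \]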

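Likewise, let $\psi$ be the dual witness for the $(1 - (\epsilon - 1/3)/4)$-approximate degree of $\mathrm{OR}_N$. By Thm. 1
there is some $\delta < (\epsilon - 1/3)/4$ and some $d' = \Omega(\sqrt{N})$ such that $\psi$ satisfies:

\[ \sum_{x \in \{-1,1\}^N} \psi(x)\,\mathrm{OR}_N(x) = 1 - \delta, \tag{7} \]

\[ \sum_{x \in \{-1,1\}^N} |\psi(x)| = 1, \tag{8} \]

\[ \sum_{x \in \{-1,1\}^N} \psi(x) \chi_S(x) = 0 \quad \text{for each } |S| \le d'. \tag{9} \]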

We will also make use of the following easy lemma, which tells us the precise values of $\psi(1_N)$ and $\Psi(-1_M)$.
This is essentially a restatement of a result due to Gavinsky and Sherstov [13].

Lemma 4.

\[ 1 - \delta = \sum_{x \in \{-1,1\}^N} \psi(x)\,\mathrm{OR}_N(x) = 2\psi(1_N). \tag{10} \]

In particular, $\psi(1_N) > 0$. Similarly,

\[ \epsilon = \sum_{x \in \{-1,1\}^M} \Psi(x)\,\mathrm{AND}_M(x) = -2\Psi(-1_M). \tag{11} \]

In particular, $\Psi(-1_M) < 0$.

Proof of Lemma 4. The first part follows because $\sum_{x \in \{-1,1\}^N} \psi(x)\,\mathrm{OR}_N(x) = 2\psi(1_N) - \sum_{x \in \{-1,1\}^N} \psi(x)$.
The second term on the right-hand side is zero because $\psi$ is orthogonal to all polynomials of degree at most
$d'$, and in particular $\psi$ is orthogonal to the constant function. The proof for the second part is similar. □

As in Sherstov's proof of Proposition 3, we define $\zeta : (\{-1,1\}^N)^M \to \mathbb{R}$ by

\[ \zeta(x_1, \dots, x_M) := 2^M \, \Psi(\dots, \mathrm{sgn}(\psi(x_i)), \dots) \prod_{i=1}^{M} |\psi(x_i)|, \tag{12} \]

where $x_i = (x_{i,1}, \dots, x_{i,N})$.

By Thm. 1, in order to show that $\zeta$ is a dual witness for the fact that the $(1/3)$-approximate degree of
$\mathrm{AND\text{-}OR}_N^M$ is $\Omega(\sqrt{MN})$, it suffices to show that

\[ \sum_{(x_1, \dots, x_M) \in (\{-1,1\}^N)^M} \zeta(x_1, \dots, x_M)\,\mathrm{AND\text{-}OR}_N^M(x_1, \dots, x_M) > 1/3, \tag{13} \]

\[ \sum_{(x_1, \dots, x_M) \in (\{-1,1\}^N)^M} |\zeta(x_1, \dots, x_M)| = 1, \tag{14} \]

\[ \sum_{(x_1, \dots, x_M) \in (\{-1,1\}^N)^M} \zeta(x_1, \dots, x_M)\,\chi_S(x_1, \dots, x_M) = 0 \quad \text{for each } |S| \le d \cdot d'. \tag{15} \]

Eq. (15) is proved exactly as in [36]; we provide Sherstov's argument in Appendix A.2 for completeness.
We now argue that Expression (13) and Eq. (14) hold as well.

Proof of Eq. (14). Let $\mu$ be the distribution on $(\{-1,1\}^N)^M$ given by $\mu(x_1, \dots, x_M) = \prod_{i=1}^{M} |\psi(x_i)|$. Since
$\psi$ is orthogonal to the constant polynomial, it has expected value 0, and hence the string $(\dots, \mathrm{sgn}(\psi(x_i)), \dots)$
is distributed uniformly in $\{-1,1\}^M$ when one samples $(x_1, \dots, x_M)$ according to $\mu$. Thus,

\[ \sum_{(x_1, \dots, x_M) \in (\{-1,1\}^N)^M} |\zeta(x_1, \dots, x_M)| = \sum_{z \in \{-1,1\}^M} |\Psi(z)| = 1 \]

by Eq. (5), proving Eq. (14). □
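
The normalization in Eq. (14) can be checked by brute force on a toy instance. The sketch below builds $\zeta$ from Eq. (12) for $M = N = 2$; the function `compose` is a hypothetical helper, and the inputs `psi` and `Psi` are scaled parities chosen only because they have unit $\ell_1$ norm and are orthogonal to constants. They are stand-ins, not the dual witnesses for OR and AND used in the proof.

```python
from fractions import Fraction
from itertools import product

def sgn(v):
    # Matches the text: sgn(v) = -1 if v < 0 and 1 otherwise.
    return -1 if v < 0 else 1

def compose(Psi, psi, M, N):
    """Build zeta from Eq. (12) as a dict keyed by M-tuples of N-bit blocks."""
    blocks = list(product((-1, 1), repeat=N))
    zeta = {}
    for xs in product(blocks, repeat=M):
        z = tuple(sgn(psi(x)) for x in xs)
        weight = Fraction(1)
        for x in xs:
            weight *= abs(psi(x))
        zeta[xs] = 2 ** M * Psi(z) * weight
    return zeta

# Toy stand-ins (assumption): scaled parities on two bits.
psi = lambda x: Fraction(x[0] * x[1], 4)
Psi = lambda z: Fraction(z[0] * z[1], 4)
zeta = compose(Psi, psi, 2, 2)
l1_norm = sum(abs(v) for v in zeta.values())
```

The computed `l1_norm` equals 1 exactly, as the uniformity argument above predicts for any $\psi$ with mean zero and any $\Psi$ with unit $\ell_1$ norm.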
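
Proof of Expression (13). Using the same distribution $\mu$ as in the proof of Eq. (14), observe that

\[
\begin{aligned}
\sum_{(x_1, \dots, x_M)} \zeta(x_1, \dots, x_M)\,\mathrm{AND\text{-}OR}_N^M(x_1, \dots, x_M)
&= 2^M \, \mathbf{E}_\mu\big[\Psi(\dots, \mathrm{sgn}(\psi(x_i)), \dots)\,\mathrm{AND}_M(\dots, \mathrm{OR}_N(x_i), \dots)\big] \\
&= \sum_{z \in \{-1,1\}^M} \Psi(z) \Big( \sum_{(x_1, \dots, x_M)} \mathrm{AND}_M(\dots, \mathrm{OR}_N(x_i), \dots)\,\mu(x_1, \dots, x_M \mid z) \Big),
\end{aligned}
\tag{16}
\]

where $\mu(x \mid z)$ denotes the probability of $x$ under $\mu$, conditioned on $(\dots, \mathrm{sgn}(\psi(x_i)), \dots) = z$.

Let $A_1 = \{x \in \{-1,1\}^N : \psi(x) > 0, \mathrm{OR}_N(x) = -1\}$ and $A_{-1} = \{x \in \{-1,1\}^N : \psi(x) < 0, \mathrm{OR}_N(x) = 1\}$,
so $A_1 \cup A_{-1}$ is the set of all inputs $x$ where the sign of $\psi(x)$ disagrees with $\mathrm{OR}_N(x)$. Notice that
$\sum_{x \in A_1 \cup A_{-1}} |\psi(x)| \le \delta/2$ because $\psi$ has unit $\ell_1$ norm and correlation $1 - \delta$ with $\mathrm{OR}_N$.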

As noted in [35], for any given $z \in \{-1,1\}^M$, the following two random variables are identically
distributed:

• The string $(\dots, \mathrm{OR}_N(x_i), \dots)$ when one chooses $(\dots, x_i, \dots)$ from the conditional distribution $\mu(\cdot \mid z)$.
• The string $(\dots, y_i z_i, \dots)$, where $y \in \{-1,1\}^M$ is a random string whose $i$th bit independently takes
on value $-1$ with probability $2 \sum_{x \in A_{z_i}} |\psi(x)| \le \delta$.

Thus, Expression (16) equals

\[ \sum_{z \in \{-1,1\}^M} \Psi(z) \cdot \mathbf{E}_y[\mathrm{AND}_M(\dots, y_i z_i, \dots)], \tag{17} \]

where $y \in \{-1,1\}^M$ is a random string whose $i$th bit independently takes on value $-1$ with probability
$2 \sum_{x \in A_{z_i}} |\psi(x)|$.
We first argue that the term corresponding to $z = -1_M$ contributes $-\Psi(z)$ to Expression (17). By Eq. (10)
of Lemma 4, if $\mathrm{OR}_N(x) = 1$ (i.e. if $x = 1_N$), then $\mathrm{sgn}(\psi(x)) = 1$. This implies that $A_{-1}$ is empty; that is,
if $\mathrm{sgn}(\psi(x)) = -1$, then it must be the case that $\mathrm{OR}_N(x) = -1$. Therefore, for $z = -1_M$, the $y_i$'s are all
$1$ with probability 1 (so every product $y_i z_i$ equals $-1$), and hence
$\mathbf{E}_y[\mathrm{AND}_M(\dots, y_i z_i, \dots)] = \mathrm{AND}_M(-1_M) = -1$. By Part 2 of Lemma 4,
$\mathrm{sgn}(\Psi(-1_M)) = -1$, and thus the term corresponding to $z = -1_M$ contributes $-\Psi(z) > 0$ to Expression (17) as
claimed.

All $z \ne -1_M$ can be handled as in Sherstov's proof of Proposition 3, because $\mathrm{AND}_M$ has low block
sensitivity at these inputs. To formalize this, we invoke the following proposition, whose proof we provide
in Appendix A.1 for completeness.

Proposition 5 ([36]). Let $F : \{-1,1\}^M \to \{-1,1\}$ be a given Boolean function. Let $y \in \{-1,1\}^M$ be a
random string whose $i$th bit is set to $-1$ with probability at most $\alpha \in [0,1]$, and to $+1$ otherwise, independently
for each $i$. Then for every $z \in \{-1,1\}^M$,

\[ \mathbf{P}_y[F(z_1, \dots, z_M) \ne F(z_1 y_1, \dots, z_M y_M)] \le 2\alpha\,\mathrm{bs}_z(F). \]

In particular, since $\mathrm{bs}_z(\mathrm{AND}_M) = 1$ for all $z \ne -1_M$, Proposition 5 implies that for all $z \ne -1_M$ and
$F = \mathrm{AND}_M$, $\mathbf{P}_y[F(z_1, \dots, z_M) = F(z_1 y_1, \dots, z_M y_M)] \ge 1 - 2\delta$.

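Recalling that the term corresponding to $z = -1_M$ contributes $-\Psi(-1_M)$ to the sum, we obtain the
following lower bound on Expression (17). For each $z \ne -1_M$, the expectation $\mathbf{E}_y[\mathrm{AND}_M(\dots, y_i z_i, \dots)]$
differs from $\mathrm{AND}_M(z)$ by at most $2 \cdot 2\delta = 4\delta$, so

\[
\begin{aligned}
\sum_{z \in \{-1,1\}^M} \Psi(z) \cdot \mathbf{E}_y[\mathrm{AND}_M(\dots, y_i z_i, \dots)]
&\ge -\Psi(-1_M) + \Big( \sum_{z \ne -1_M} \Psi(z)\,\mathrm{AND}_M(z) \Big) - 4\delta \sum_{z \ne -1_M} |\Psi(z)| \\
&\ge \Big( \sum_{z \in \{-1,1\}^M} \Psi(z)\,\mathrm{AND}_M(z) \Big) - 4\delta = \epsilon - 4\delta > 1/3.
\end{aligned}
\]

This completes the proof of Theorem 2. □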



Remark 6. Špalek [40] has exhibited an explicit dual witness showing that the $\epsilon$-approximate degree of both
the AND function and the OR function is $\Omega(\sqrt{n})$, for $\epsilon = 1/14$ (in fact, we generalize Špalek's construction
in the next section to any symmetric function). It is relatively straightforward to modify his construction to
handle any constant $\epsilon \in (0,1)$. With these dual polynomials in hand, the dual solution given in our proof
is completely explicit. This answers a question of Špalek [40, Section 4] in the affirmative.



4 Dual Polynomials for Symmetric Boolean Functions



In this section, we construct a dual polynomial witnessing a tight lower bound on the approximate degree of
any symmetric function. The lower bound we recover was first proved by Paturi [26] via a symmetrization
argument combined with the classical Markov-Bernstein inequality from approximation theory (see Section
5). Paturi also provided a matching upper bound. Špalek [40] presented a dual witness to the $\Omega(\sqrt{n})$
approximate degree of the OR function and asked whether one could construct an analogous dual polynomial
for the symmetric $t$-threshold function [40, Section 4]. We accomplish this in the more general case of
arbitrary symmetric functions by extending the ideas underlying Špalek's dual polynomial for OR.

4.1 Symmetric Functions 

For a vector $x \in \{-1,1\}^n$, let $|x| = \frac{1}{2}(n - (x_1 + \dots + x_n))$ denote the number of $-1$'s in $x$. A Boolean function
$f : \{-1,1\}^n \to \{-1,1\}$ is symmetric if $f(x) = f(y)$ whenever $|x| = |y|$. That is, the value of $f$ depends only
on the number of inputs that are set to $-1$. The simplest symmetric functions are the $t$-threshold functions:

\[ \tau_t(x) = \begin{cases} -1 & \text{if } |x| \ge t, \\ 1 & \text{otherwise.} \end{cases} \]

Important special cases include $\mathrm{OR} = \tau_1$, $\mathrm{AND} = \tau_n$, and the majority function $\mathrm{MAJ} = \tau_{\lceil n/2 \rceil}$. Let
$[n] = \{0, 1, \dots, n\}$. To each symmetric function $f$, we can associate a unique univariate function $F : [n] \to \{-1,1\}$
by taking $F(|x|) = f(x)$. Throughout this section, we follow the convention that lower case letters refer to
multivariate functions, while upper case letters refer to their univariate counterparts.

We now discuss the dual characterization of approximate degree established in Thm. 1 as it applies to
symmetric functions. Following the notation in [40], the standard inner product $p \cdot q = \sum_{x \in \{-1,1\}^n} p(x) q(x)$
on symmetric functions $p, q$ induces an inner product on the associated univariate functions:

\[ P \cdot Q := \sum_{i=0}^{n} \binom{n}{i} P(i) Q(i). \]

We refer to this as the correlation between $P$ and $Q$. Similarly, the $\ell_1$-norm $\|p\|_1 = \sum_{x \in \{-1,1\}^n} |p(x)|$
induces a norm $\|P\|_1 = \sum_{i=0}^{n} \binom{n}{i} |P(i)|$. These definitions carry over verbatim when $f$ is real-valued instead
of Boolean-valued.

If $f$ is symmetric, we can restrict our attention to symmetric $\phi$ in the statement of Thm. 1, and it becomes
convenient to work with the following reformulation of Thm. 1.

Corollary 7. A symmetric function $f : \{-1,1\}^n \to \{-1,1\}$ has $\epsilon$-approximate degree greater than $d$ iff
there exists a symmetric function $\phi : \{-1,1\}^n \to \mathbb{R}$ with pure high degree $d$ such that

\[ \frac{\Phi \cdot F}{\|\Phi\|_1} = \frac{\sum_{i=0}^{n} \binom{n}{i} \Phi(i) F(i)}{\sum_{i=0}^{n} \binom{n}{i} |\Phi(i)|} > \epsilon. \]

(Here, $F$ and $\Phi$ are the univariate functions associated to $f$ and $\phi$, respectively.)

We clarify that the pure high degree of a multivariate polynomial $\phi$ does not correspond to the smallest
degree of a monomial in the associated univariate function $\Phi$. When we talk about the pure high degree of
a univariate polynomial $\Phi$, we mean the pure high degree of its corresponding multilinear polynomial $\phi$.

We exploit the following method for constructing polynomials of pure high degree $d$. Let $\psi$ be a
multivariate polynomial of degree $n - d$, and let $\chi_{[n]}(x)$ denote the parity function on $n$ variables. Consider the
function $\phi(x) = \psi(x) \chi_{[n]}(x)$, i.e. $\phi$ is obtained by multiplying $\psi$ by the parity function. It is straightforward
to check that $\phi$ has pure high degree $d$. Notice that if $\psi$ is symmetric, then so is $\phi$, and the corresponding
univariate polynomials satisfy $\Phi(k) = \Psi(k) \cdot (-1)^k$. Therefore, to show that a symmetric function $f$ with
a "jump" at $t$ has approximate degree greater than $d$, it is enough to exhibit an $(n - d)$-degree univariate
polynomial $\Psi$ such that $(-1)^i \Psi(i)$ has high correlation with its associated univariate function $F$.
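
The effect of multiplying by parity can be checked directly on a small hypercube. In the sketch below, the symmetric polynomial `psi` of degree $n - d$ is an arbitrary choice made for illustration, and `is_orthogonal_below` is a hypothetical helper that tests orthogonality to every character $\chi_S$ with $|S| < d$: each monomial $\chi_S$ of $\psi$ becomes $\chi_{[n] \setminus S}$ after multiplication, which has at least $d$ variables.

```python
from itertools import combinations, product

def chi(S, x):
    # Character chi_S(x) = product of x_i over i in S.
    value = 1
    for i in S:
        value *= x[i]
    return value

def is_orthogonal_below(phi, n, d):
    """True if phi is orthogonal to every chi_S with |S| < d."""
    cube = list(product((-1, 1), repeat=n))
    return all(
        sum(phi(x) * chi(S, x) for x in cube) == 0
        for size in range(d)
        for S in combinations(range(n), size)
    )

n, d = 6, 3
# Hamming weight |x| = number of -1 entries; it has degree 1 in x.
weight = lambda x: sum(1 for b in x if b == -1)
# An arbitrary symmetric polynomial of degree n - d = 3 (assumption for the example).
psi = lambda x: weight(x) ** 3 - 2 * weight(x) + 5
# Multiply by parity to push all monomials up to degree at least d.
phi = lambda x: psi(x) * chi(range(n), x)
```

With these choices `phi` is orthogonal to all characters of degree below 3, but not to all characters of degree 3, since `psi` has a nonzero degree-3 coefficient.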

We are now in a position to state the lower bound that we will prove in this section. Paturi [26] completely
characterized the approximate degree of a symmetric Boolean function by the location of the layer $t$ closest
to the center of the Boolean hypercube such that $F(t - 1) \ne F(t)$.

Theorem 8 ([26, Theorem 4]). Given a nonconstant symmetric Boolean function $f$ with associated univariate
function $F$, let $\Gamma(f) = \min\{|2t - n - 1| : F(t - 1) \ne F(t), 1 \le t \le n\}$. Then $\widetilde{\deg}(f) = \Theta(\sqrt{n(n - \Gamma(f))})$.

Paturi proved the upper bound non-explicitly by appealing to Jackson theorems from approximation
theory, while Sherstov [30] gave an explicit approximating polynomial yielding the upper bound. Paturi
proved the lower bound by combining symmetrization with an appeal to the Markov-Bernstein inequality (see
Section 5); his proof does not yield an explicit dual polynomial. We construct an explicit dual polynomial
to prove the following proposition, which is easily seen to imply Paturi's lower bound.

Proposition 9. Given $f$ and $F$ as above, let $1 \le t \le n$ be an integer with $F(t - 1) \ne F(t)$. Then
$\widetilde{\deg}(f) = \Omega(\sqrt{t(n - t + 1)})$.

In particular, the approximate degree of the symmetric $t$-threshold function is $\Omega(\sqrt{t(n - t + 1)})$. This
special case serves as a useful model for understanding our construction.

4.2 Proof Outline 

We start with an intuitive discussion of Špalek's construction of a dual polynomial for OR, with the goal of
elucidating how we extend the construction to arbitrary symmetric functions. Consider the perfect squares
$S = \{k^2 : k^2 \le n\}$ and the univariate polynomial

\[ R(x) = \frac{1}{n!} \prod_{i \in [n] \setminus S} (x - i). \]

This polynomial is supported on $S$, and for all $k^2 \in S$,

\[ \binom{n}{k^2} |R(k^2)| = \binom{n}{k^2} \frac{\prod_{i \in [n],\, i \ne k^2} |k^2 - i|}{n! \prod_{i \in S,\, i \ne k^2} |k^2 - i|} = \frac{1}{\prod_{i \in S,\, i \ne k^2} |k^2 - i|}. \]

Note the remarkable cancellation in the final equality. This quotient is maximized at $k = 1$. In other words,
the threshold point $t = 1$ makes the largest contribution to the $\ell_1$ mass of $R$. Moreover, one can check that
$R(0)$ is only a constant factor smaller than $R(1)$.

Špalek exploits this distribution of the $\ell_1$ mass by considering the polynomial $P(x) = R(x)/(x - 2)$. The
values of $P(x)$ are related to $R(x)$ by a constant multiple for $x = 0, 1$, but $P(k)$ decays as $|P(k^2)| \approx |R(k^2)|/k^2$
for larger values. This decay is fast enough that a constant fraction of the $\ell_1$ mass of $P$ comes from the point
$P(0)$.¹ Now $P$ is an $(n - \Omega(\sqrt{n}))$-degree univariate polynomial, so we just need to show that $Q(i) = (-1)^i P(i)$
has high correlation with OR. We can write

\[ Q \cdot \mathrm{OR} = 2Q(0) - Q \cdot 1 = 2Q(0), \]

since the multilinear polynomial associated to $Q$ has pure high degree $\Omega(\sqrt{n})$, and therefore has zero
correlation with constant functions. Because a constant fraction of the $\ell_1$ mass of $Q$ comes from $Q(0)$, it follows
that $|Q \cdot \mathrm{OR}| / \|Q\|_1$ is bounded below by a constant. By perhaps changing the sign of $Q$, we get a good dual
polynomial for OR.
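
The cancellation behind this argument is easy to confirm with exact arithmetic. The sketch below assumes that $R$ is normalized by $1/n!$ and that $S$ is the set of perfect squares in $\{0, \dots, n\}$; the function names are hypothetical. It compares the weighted mass $\binom{n}{k^2}|R(k^2)|$, computed from the product over $[n] \setminus S$, with the closed form $1 / \prod_{i \in S, i \ne k^2} |k^2 - i|$.

```python
from fractions import Fraction
from math import comb, factorial, isqrt

def weighted_mass(n, k):
    """binom(n, k^2) * |R(k^2)|, computed directly from the product over [n] \\ S."""
    squares = {j * j for j in range(isqrt(n) + 1)}
    point = k * k
    product_term = 1
    for i in range(n + 1):
        if i not in squares:
            product_term *= abs(point - i)
    return Fraction(comb(n, point) * product_term, factorial(n))

def closed_form(n, k):
    """1 / prod over squares i != k^2 of |k^2 - i|."""
    denom = 1
    for j in range(isqrt(n) + 1):
        if j != k:
            denom *= abs(k * k - j * j)
    return Fraction(1, denom)

n = 30
masses = [weighted_mass(n, k) for k in range(isqrt(n) + 1)]
```

For $n = 30$ the two computations agree at every square, and the largest entry of `masses` sits at $k = 1$, with the entry at $k = 0$ smaller by a constant factor.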

¹ It is also necessary to check that $P(2)$ is only a constant factor larger than $P(0)$.

A natural approach to extend Špalek's argument to symmetric functions with a "jump" at $t$ is the
following:

Step 1: Find a set $S$ with $|S| = \Omega(\sqrt{t(n - t + 1)})$ such that the maximum contribution to the $\ell_1$ norm of
$R(x) = \frac{1}{n!} \prod_{i \in [n] \setminus S} (x - i)$ comes from the point $x = t$. Equivalently,
$\binom{n}{j} |R(j)| = 1 / \prod_{i \in S \setminus \{j\}} |j - i|$ is maximized at $j = t$.

Step 2: Define a polynomial $P(x) = R(x) / ((x - (t - 1))(x - (t + 1)))$. Dividing $R(x)$ by the factor $(x - t - 1)$
is analogous to Špalek's division of $R(x)$ by $(x - 2)$. We also divide by $(x - t + 1)$ because we will
ultimately need our polynomial $P(x)$ to decay faster than Špalek's by a factor of $|x - t|$ as $x$ moves
away from the threshold. By dividing by both $(x - t - 1)$ and $(x - t + 1)$, we ensure that most of
the $\ell_1$ mass of $P$ is concentrated at the points $t - 1, t, t + 1$.

Step 3: Obtain $Q$ by multiplying $P$ by parity, and observe that $Q(t - 1)$ and $Q(t)$ have opposite signs. Since
$F(t - 1)$ and $F(t)$ also have opposite signs, we can ensure that both $t - 1$ and $t$ contribute positive
correlation. Suppose these two points contribute a $1/2 + \epsilon$ fraction of the $\ell_1$-norm of $Q$.
Then even in the worst case where the remaining points all contribute negative correlation, $Q \cdot F$ is
still at least a $2\epsilon$ fraction of $\|Q\|_1$ and we have a good dual polynomial. Notice that the pure high
degree of $Q$ is $|S| + 2$, yielding the desired lower bound.

In Section 4.3 we carry out this line of attack in the case where $t = \Omega(n)$. This partial result also gives
the right intuition for general $t$, although the details are somewhat more complicated. Namely, in Step 3,
we may need to rely on the alternative points $t$ and $t + 2$ to contribute high positive correlation between $F$
and $Q$, rather than inputs $t - 1$ and $t$.

4.3 A Dual Polynomial for MAJ 

We first construct a dual polynomial that witnesses an $\Omega(t)$ lower bound for symmetric functions having
a "jump" at $t \le n/2$. Notice that this bound matches Proposition 9 if $t = \Omega(n)$, but is weaker otherwise
(e.g. in the case of OR). By setting $t \approx n/2$, we can write down a clean dual polynomial for the
majority function MAJ. This case is illustrative, as one can view Špalek's dual polynomial for OR and our
dual polynomial for MAJ as two ends of a spectrum, with our general construction interpolating between
the two extremes.

Proposition 10. Let $f : \{-1,1\}^n \to \{-1,1\}$ be a Boolean function with associated univariate function $F$.
If $1 \le t \le n/2$ is such that $F(t - 1) \ne F(t)$, then $\widetilde{\deg}(f) = \Omega(t)$.

Proof. We follow the proof outline given in the previous section. Define the set

\[ S = \{t \pm 4\ell : 0 \le \ell \le t/4\}. \]

Note that $|S| = \Omega(t)$. We claim that $\pi_S(i) := \prod_{j \in S,\, j \ne i} |j - i|$ is minimized at $i = t$. Notice that translating
all points in $S$ by a constant does not affect $\pi_S(i)$, and scaling all points in $S$ by a constant does not affect
$\mathrm{argmin}_i\, \pi_S(i)$. Thus, it is enough to show that $\pi_{S^*}(i)$ is minimized at $i = 0$ for the set $S^* = \{\pm \ell : 0 \le \ell \le t\}$. In
this case, $\pi_{S^*}(i)$ takes the simple form $(t - i)!\,(t + i)!$, and we see that

\[ \frac{\pi_{S^*}(0)}{\pi_{S^*}(i)} = \frac{(t!)^2}{(t - i)!\,(t + i)!} = \frac{t}{t + |i|} \cdot \frac{t - 1}{t + |i| - 1} \cdots \frac{t - |i| + 1}{t + 1} \]

is a product of terms smaller than 1, so $\pi_{S^*}(i)$ is indeed minimized at $i = 0$.
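
A quick numerical check of Step 1 for the symmetric set $S^* = \{-t, \dots, t\}$: the sketch below computes the product of distances from each point $i \in S^*$ to the other points, which should equal $(t - i)!\,(t + i)!$ and be smallest at the center. The name `pi_star` is hypothetical and used only for this example.

```python
from math import factorial

def pi_star(t, i):
    """Product of |j - i| over j in {-t, ..., t} with j != i."""
    result = 1
    for j in range(-t, t + 1):
        if j != i:
            result *= abs(j - i)
    return result

t = 7
values = {i: pi_star(t, i) for i in range(-t, t + 1)}
```

For $t = 7$ every entry of `values` matches the factorial formula, and the unique minimizer is $i = 0$.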

With Step 1 completed, we let $T = S \cup \{t - 1, t + 1\}$ and define the polynomial

\[ P(x) = \frac{(-1)^s\, 4^{2h} (h!)^2}{n!} \prod_{j \in [n] \setminus T} (x - j), \]

where $h = \lfloor t/4 \rfloor$ and $s$ is a sign bit to be determined later. The normalization is chosen so that $\binom{n}{t} |P(t)| = 1$.
We divide by both $(x - (t - 1))$ and $(x - (t + 1))$ to ensure that the rate of decay of $P(x)$ is at least quadratic
as $x$ moves away from $t$. This will ultimately allow us to show that most of the $\ell_1$ mass of $P$ comes from the
points $x = t - 1$ and $x = t$.

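Write the $\ell_1$ contribution due to the point $r \in T$ as

\[ \binom{n}{r} |P(r)| = \binom{n}{r} \frac{4^{2h} (h!)^2}{n!} \prod_{j \in [n] \setminus T} |r - j| = \frac{4^{2h} (h!)^2}{\prod_{j \in T \setminus \{r\}} |r - j|}. \]

For $r = t \pm 1$ this becomes

\[ \frac{4^{2h} (h!)^2}{2 \prod_{\ell=1}^{h} (4\ell - 1)(4\ell + 1)} = \frac{1}{2} \prod_{\ell=1}^{h} \Big( 1 + \frac{1}{16\ell^2 - 1} \Big) \le \frac{1}{2} e^{\pi^2/90} < 1, \]

where the first inequality holds because $1 + x \le e^x$ for all $x \ge 0$, together with
$\sum_{\ell \ge 1} 1/(16\ell^2 - 1) \le \sum_{\ell \ge 1} 1/(15\ell^2) = \pi^2/90$. This shows that the $\ell_1$ contributions of the
points $t - 1$ and $t + 1$ are equal, and not too large: $\binom{n}{t-1} |P(t - 1)| = \binom{n}{t+1} |P(t + 1)| < 1$.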

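Now we analyze the remaining summands, and show that their total contribution is much smaller than
1. Recall that the choice $i = t$ minimizes $\pi_S(i)$, and that $\pi_S(t) = 4^{2h} (h!)^2$. Therefore,

\[ \binom{n}{t + 4\ell} |P(t + 4\ell)| = \frac{4^{2h} (h!)^2}{\prod_{j \in T \setminus \{t + 4\ell\}} |t + 4\ell - j|} \le \frac{1}{|4\ell + 1|\,|4\ell - 1|} \le \frac{1}{15\ell^2}. \]

We can use this quadratic decay to bound the total $\ell_1$ mass of the points outside of $\{t - 1, t, t + 1\}$:

\[ \sum_{j \in S \setminus \{t\}} \binom{n}{j} |P(j)| \le \frac{1}{15} \sum_{\ell = -h,\, \ell \ne 0}^{h} \frac{1}{\ell^2} \le \frac{\pi^2}{45} < \frac{1}{4}. \]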

For the final part of our construction, we multiply $P$ by parity to get $Q(i) = (-1)^i P(i)$. Since $P(t - 1)$
and $P(t)$ have the same sign, $Q(t - 1)$ and $Q(t)$ have opposite signs. Since $F(t - 1)$ and $F(t)$ also have
opposite signs, we can choose $s = \pm 1$ to ensure that

\[ Q \cdot F \ge \binom{n}{t-1} |P(t - 1)| + \binom{n}{t} |P(t)| - \binom{n}{t+1} |P(t + 1)| - \sum_{j \in S \setminus \{t\}} \binom{n}{j} |P(j)| \ge 1 - \frac{1}{4} = \frac{3}{4}. \]

As the total $\ell_1$ mass is at most $3 + \frac{1}{4}$, we get that $(Q \cdot F) / \|Q\|_1 \ge \frac{3}{13}$. By Corollary 7, the $\frac{3}{13}$-approximate
degree of $f$ is $\Omega(t)$. □
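
The construction in this proof can be instantiated and checked exactly for small parameters. The sketch below is an illustration under stated assumptions: it takes the normalization constant to be $4^{2h}(h!)^2/n!$ (the value forced by $\binom{n}{t}|P(t)| = 1$), uses the $t$-threshold function as $F$, and picks the sign bit so that the point $t$ contributes positively. The function name `majority_type_dual` is hypothetical. Pure high degree is tested through the moments $\sum_k \binom{n}{k} Q(k) k^j$, which vanish for all $j$ with $j + \deg P < n$ because an $n$-th finite difference annihilates polynomials of degree below $n$.

```python
from fractions import Fraction
from math import comb, factorial

def majority_type_dual(n, t, F):
    """Return (Q, h), where Q[k] = (-1)^k P(k) for the polynomial P of this proof."""
    h = t // 4
    S = {t + 4 * l for l in range(-h, h + 1)}
    T = S | {t - 1, t + 1}
    # Assumed normalization: makes binom(n, t) * |P(t)| equal to 1.
    scale = Fraction(4 ** (2 * h) * factorial(h) ** 2, factorial(n))
    Q = []
    for k in range(n + 1):
        value = scale
        for j in range(n + 1):
            if j not in T:
                value *= k - j
        Q.append((-1) ** k * value)
    # Choose the sign bit s so that the point t contributes positively.
    if Q[t] * F(t) < 0:
        Q = [-q for q in Q]
    return Q, h

n, t = 24, 12
F = lambda k: -1 if k >= t else 1  # univariate form of the t-threshold function
Q, h = majority_type_dual(n, t, F)
correlation = sum(comb(n, k) * Q[k] * F(k) for k in range(n + 1))
l1_norm = sum(comb(n, k) * abs(Q[k]) for k in range(n + 1))
# Orthogonality to all polynomials of degree at most 2h + 1 in the Hamming weight.
moments = [sum(comb(n, k) * Q[k] * k ** j for k in range(n + 1)) for j in range(2 * h + 2)]
```

For $n = 24$ and $t = 12$ all the listed moments are exactly zero and the ratio `correlation / l1_norm` is a constant bounded well away from zero, in line with the proof.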

4.4 General Symmetric Boolean Functions 

We now show how to generalize our dual polynomial for MAJ and Špalek's dual polynomial for OR to handle
arbitrary symmetric functions. Recall that we are given a Boolean function $f$, associated univariate
function $F$, and a number $t$ such that $F(t - 1) \ne F(t)$. As our goal is to show that $\widetilde{\deg}(f) = \Omega(\sqrt{t(n - t + 1)})$,
we may without loss of generality assume that $t \le n/2$ throughout this section. As the case of $t = 1$ is
handled by Špalek's construction, we may also assume $t \ge 2$ to improve the constants in our analysis.

As a first attempt at defining a suitable set $S$ for use in constructing a dual polynomial for $f$, we consider
the set $S' = \{t k^2 : k^2 \le n/t\}$. Fact 23 in Appendix B implies that $\prod_{i \in S',\, i \ne j} |i - j|$ is minimized at $j = t$.
Unfortunately, the set $S'$ is too small: it has size only $\Theta(\sqrt{n/t})$ instead of $\Theta(\sqrt{t(n - t + 1)})$. The trick
is to notice that the points in $S'$ are separated by at least $t$. Therefore, we should be able to interlace $\Theta(t)$
translated copies of $S'$, and still have the desired product minimized near $t$. The following lemma gives the
details, but its proof is rather technical and deferred to Appendix B.

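Lemma 11. Let

\[ S = \{t k^2 + 4\ell : 1 \le k \le \sqrt{(n - t + 1)/t},\ 0 \le \ell \le ct\} \cup \{t - 4\ell : 0 \le \ell \le ct\}, \]

where $c < 1/32$. Then for $i \in S$, the product

\[ \pi_S(i) = \prod_{i' \in S,\, i' \ne i} |i - i'| \]

is minimized for some $i^*$ with $(1 - 4c)t \le i^* \le (1 + 4c)t$.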

Thus the product of differences is minimized somewhere in a $\Theta(t)$-sized neighborhood of $t$. The exact
location depends delicately on $n$ and $t$. However, since $\pi_S(i)$ is invariant under translations of $S$,
we can assume that the minimizer $i^*$ is one of the points $t - 1$, $t$, or $t + 1$.

For intuition, observe that one can view the set $S$ of Lemma 11 as interpolating between the set for OR
used by Špalek, and the set used to prove our $\Omega(t)$ lower bound in Proposition 10. Notice that $S$ contains
all points of the form $t \pm 4\ell$, plus additional points corresponding to perfect squares when $t = o(n)$.

Let $S$ be the set in Lemma 11 (or a translate thereof), and suppose the corresponding product is minimized
at $i^*$. Let $T = S \cup \{i^* - 1, i^* + 1\}$. Define the polynomial

\[ P(x) = \frac{(-1)^s\, \pi_S(i^*)}{n!} \prod_{j \in [n] \setminus T} (x - j), \]

where the sign bit $s$ is to be determined later. The choice of normalization is so that $\binom{n}{i^*} |P(i^*)| = 1$. Our
goal is now to show that the $\ell_1$ mass of $P$ is concentrated at the points $i^* - 1$, $i^*$, and $i^* + 1$. The following
lemma shows that the contribution of a point $x$ to the $\ell_1$ mass of $P$ decays at a quadratic rate as $x$ moves
away from $i^*$. This is precisely because we include both points $i^* - 1$ and $i^* + 1$ in $T$. Full details are in
Appendix B.

Lemma 12. Let $r = t k^2 + 4\ell$ for $k \ge 2$ and $0 \le \ell \le ct$. Then $\binom{n}{r} |P(r)| \le 1 / (t^2 (k^2 - 2)^2)$. If $r = i^* + 4\ell$,
then $\binom{n}{r} |P(r)| \le 1 / (16\ell^2 - 1)$.

Since the sum of the inverse squares of the integers is bounded by a constant, the total contribution of
these points to $\|P\|_1$ is dominated by the mass contributed by $P(i^*)$.

Lemma 13.

\[ \sum_{j \in T \setminus \{i^* - 1,\, i^*,\, i^* + 1\}} \binom{n}{j} |P(j)| \le \frac{2}{5}. \]

This bound allows us to sketch a proof of Proposition 9 in full generality.

Proof sketch of Proposition 9. We consider three cases based on which of $\binom{n}{i^* - 1} |P(i^* - 1)|$, $\binom{n}{i^*} |P(i^*)| = 1$,
and $\binom{n}{i^* + 1} |P(i^* + 1)|$ is the smallest. The relationship between these terms determines how we choose the
location of $i^*$ relative to the "jump" at $t$. We set $i^*$ so that after multiplying $P$ by parity to obtain a
polynomial $Q$, the larger two of these terms contribute positively to $Q \cdot F$. They will hence dominate the
(possibly negative) correlation due to the smallest term, as well as the contribution of size at most $2/5$ due
to remaining points in $T$. Ultimately, we show that $(Q \cdot F) / \|Q\|_1$ is at least a fixed positive constant, which
gives the asserted lower bound by Corollary 7. The calculations for each of these cases are analogous to those
in the proof of Proposition 10 and given in Appendix B.

□

5 A Constructive Proof of Markov-Bernstein Inequalities 

The Markov-Bernstein inequality for polynomials with real coefficients asserts that

\[ |p'(x)| \le \min\left\{ \frac{n}{\sqrt{1 - x^2}},\ n^2 \right\} \|p\|_{[-1,1]}, \qquad x \in (-1, 1), \]

for every real polynomial of degree at most $n$. Here, and in what follows,

\[ \|p\|_{[-1,1]} := \sup_{y \in [-1,1]} |p(y)|. \]

This inequality has found numerous uses in theoretical computer science, especially in conjunction with 
symmetrization as a method for bounding the e-approximate degree of various functions (e.g. pil51 [TBl[roil25l 

miiiina). 

We prove a number of important special cases of this inequality based on linear programming duality. 
Our proofs are constructive in that we exhibit explicit dual solutions to a linear program bounding the 
derivative of a constrained polynomial. 

The special cases of the Markov-Bernstein inequality that we prove are sufficient for many applications in theoretical computer science. The dual solutions we exhibit are remarkably clean, and we believe that they shed new light on these classical inequalities.

5.1 Proving the Markov-Bernstein Inequality at x = 0

The following linear program with uncountably many constraints captures the problem of finding a polynomial $p(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0$ with real-valued coefficients that maximizes $|p'(0)|$ subject to the constraint that $\|p\|_{[-1,1]} \le 1$. Below, the variables are $c_0, \ldots, c_n$, and there is a constraint for every $x \in [-1, 1]$.

\[ \begin{array}{ll}
\max & c_1 \\
\text{such that} & \sum_{i=0}^n c_i x^i \le 1, \quad \forall x \in [-1, 1] \\
 & -\sum_{i=0}^n c_i x^i \le 1, \quad \forall x \in [-1, 1]
\end{array} \]

One might initially be concerned that our goal is to bound $|p'(0)|$, while the above LP only yields an upper bound on $p'(0)$. But for any polynomial $p$ satisfying $\|p\|_{[-1,1]} \le 1$ whose derivative at 0 is negative, $-p$ is a feasible solution to the above LP achieving value $|p'(0)|$. Thus, the value of the above LP indeed equals $\sup_{p \in B} |p'(0)|$, where $B$ denotes the set of all degree $n$ polynomials $p$ satisfying $\|p\|_{[-1,1]} \le 1$.

We will actually upper bound the value of the following LP, which is obtained from the above by throwing away all but finitely many constraints. Not coincidentally, the constraints that we keep are those that are tight for the primal solution corresponding to the Chebyshev polynomials of the first kind. Throughout this section, we refer to this LP as Primal.

\[ \begin{array}{ll}
\max & c_1 \\
\text{such that} & \sum_{i=0}^n c_i x^i \le 1, \quad \forall x = \cos(k\pi/n),\ k \in \{0, 2, \ldots, n-1\} \\
 & -\sum_{i=0}^n c_i x^i \le 1, \quad \forall x = \cos(k\pi/n),\ k \in \{1, 3, \ldots, n\}
\end{array} \]

The dual to Primal can be written as

\[ \begin{array}{ll}
\min & \sum_{j=0}^n y_j \\
\text{such that} & Ay = e \\
 & y_j \ge 0 \quad \forall j \in \{0, \ldots, n\}
\end{array} \]

where $A_{ij} = (-1)^{j+m} \cos^i(j\pi/n)$ (with $n = 2m + 1$, as in Proposition 14) and $e = (0, 1, 0, 0, 0, \ldots, 0)^T$. We refer to this linear program as Dual.

Our goal is to prove that Primal has value at most $n$. For odd $n$, it is well-known that this value is achieved by the coefficients of $(-1)^{(n-1)/2} T_n(x)$, where $T_n$ is the degree $n$ Chebyshev polynomial of the first kind. Our knowledge of this primal-optimal solution informed our search for a dual-optimal solution, but our proof makes no explicit reference to the Chebyshev polynomials, and we do not need to invoke strong LP duality; weak duality suffices.

Our arguments make use of a number of trigonometric identities which can all be established by elementary methods. These identities are presented in Appendix C.

Proposition 14. Let $n = 2m + 1$ be odd. Define the $(n+1) \times (n+1)$ matrix $A$ by $A_{ij} = (-1)^{j+m} \cos^i(j\pi/n)$ for $0 \le i, j \le n$. Then

\[ y = \frac{1}{n} \left( \frac{1}{2},\ \sec^2(\pi/n),\ \sec^2(2\pi/n),\ \ldots,\ \sec^2((n-1)\pi/n),\ \frac{1}{2} \right)^T \]

is the unique solution to $Ay = e_1$, where $e_1 = (0, 1, 0, 0, \ldots, 0)^T$.

Before proving the proposition, we explain its consequences. Note that y is clearly nonnegative, and thus 
is the unique feasible solution for Dual. Therefore it is the dual-optimal solution, and exactly recovers the 
Markov-Bernstein inequality at x = 0: 

Corollary 15. Let $p$ be a polynomial of degree $n = 2m + 1$ with $\|p\|_{[-1,1]} \le 1$. Then $p'(0) \le n$.

Proof. Let $y$ be as in Proposition 14. This is the unique feasible point for Dual. By Lemma 29 in Appendix C,

\[ \sum_{j=0}^{n-1} \sec^2\left(\frac{j\pi}{n}\right) = n^2, \]

so we immediately see that $\sum_{j=0}^n y_j = n$. By weak LP duality, the value of Primal is at most $n$. □
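As a numerical sanity check on Proposition 14 and Corollary 15, the following sketch builds the matrix $A$ and the candidate vector $y$ in floating point and confirms that $Ay \approx e_1$ and $\sum_j y_j \approx n$ for a few small odd $n$. The function names are illustrative and do not appear in the paper, and a finite-precision check of this kind is of course no substitute for the proof.

```python
import math

def dual_solution_at_zero(n):
    """Candidate dual vector y from Proposition 14 (n must be odd)."""
    if n % 2 == 0:
        raise ValueError("Proposition 14 assumes odd n")
    y = [1.0 / math.cos(j * math.pi / n) ** 2 for j in range(n + 1)]
    y[0] = y[n] = 0.5  # the two endpoint entries are 1/2 rather than sec^2
    return [v / n for v in y]

def apply_A(n, y):
    """Return A y, where A[i][j] = (-1)^(j+m) * cos(j*pi/n)^i and n = 2m + 1."""
    m = (n - 1) // 2
    return [
        sum((-1) ** (j + m) * math.cos(j * math.pi / n) ** i * y[j]
            for j in range(n + 1))
        for i in range(n + 1)
    ]

for n in (3, 5, 7, 9):
    y = dual_solution_at_zero(n)
    Ay = apply_A(n, y)
    e1 = [1.0 if i == 1 else 0.0 for i in range(n + 1)]
    assert all(abs(a - b) < 1e-9 for a, b in zip(Ay, e1))  # A y = e_1
    assert abs(sum(y) - n) < 1e-9                           # dual objective is n
```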

While we have recovered the Markov-Bernstein inequality only for odd-degree polynomials at zero, a simple "shift-and-scale" argument recovers the asymptotic bound for any $x$ bounded away from the endpoints $\{-1, 1\}$.

Corollary 16. Let $p$ be a polynomial of degree $n$ with $\|p\|_{[-1,1]} \le 1$. Then for any $x_0 \in (-1, 1)$, $|p'(x_0)| \le \frac{n+1}{1 - |x_0|} \|p\|_{[-1,1]}$. In particular, for any constant $\epsilon \in (0, 1)$, $\|p'\|_{[-1+\epsilon,\, 1-\epsilon]} = O(n) \|p\|_{[-1,1]}$.

Proof. Assume without loss of generality that $x_0 \in [0, 1)$; an identical argument holds if $x_0 \in (-1, 0]$. By Corollary 15, $|q'(0)| \le (n+1)\|q\|_{[-1,1]}$ for any polynomial $q$ of degree at most $n$. Define the degree-$n$ polynomial $q(x) = p((1 - x_0)x + x_0)$. Since $(1 - x_0)x + x_0 \in [-1, 1]$ for every $x \in [-1, 1]$, we have $\|q\|_{[-1,1]} \le \|p\|_{[-1,1]}$. Moreover, $q'(x) = p'((1 - x_0)x + x_0)(1 - x_0)$, so $q'(0) = p'(x_0)(1 - x_0)$. Therefore,

\[ |p'(x_0)| = \frac{|q'(0)|}{1 - |x_0|} \le \frac{n+1}{1 - |x_0|} \|q\|_{[-1,1]} \le \frac{n+1}{1 - |x_0|} \|p\|_{[-1,1]}. \]

□ 
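The shift-and-scale substitution in the proof of Corollary 16 can be illustrated on a concrete polynomial. The sketch below, whose helper names are hypothetical, composes a polynomial with the affine map $x \mapsto (1 - x_0)x + x_0$ and checks that $q'(0) = (1 - x_0)\,p'(x_0)$. The choice of $T_3(x) = 4x^3 - 3x$ and $x_0 = 1/4$ is only an example.

```python
def horner(coeffs, x):
    """Evaluate a polynomial given highest-degree-first coefficients."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc

def derivative(coeffs):
    """Coefficients (highest degree first) of the derivative polynomial."""
    deg = len(coeffs) - 1
    return [c * (deg - k) for k, c in enumerate(coeffs[:-1])]

def compose_affine(coeffs, a, b):
    """Coefficients of q(x) = p(a*x + b), built by Horner's rule on polynomials."""
    result = [0.0]
    for c in coeffs:
        # multiply the running polynomial by (a*x + b), then add the constant c
        times_ax = [a * r for r in result] + [0.0]
        times_b = [0.0] + [b * r for r in result]
        result = [s + t for s, t in zip(times_ax, times_b)]
        result[-1] += c
    return result[1:]  # drop the leading zero introduced by the initial [0.0]

p = [4.0, 0.0, -3.0, 0.0]          # T_3(x) = 4x^3 - 3x, sup norm 1 on [-1, 1]
x0 = 0.25
q = compose_affine(p, 1.0 - x0, x0)
q_prime_at_0 = horner(derivative(q), 0.0)
p_prime_at_x0 = horner(derivative(p), x0)

assert abs(q_prime_at_0 - (1.0 - x0) * p_prime_at_x0) < 1e-12
# the bound of Corollary 16 with n = 3 and ||p|| = 1
assert abs(p_prime_at_x0) <= (3 + 1) / (1.0 - abs(x0))
```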



We remark that the full Markov-Bernstein inequality guarantees that $|p'(x)| \le \frac{n}{\sqrt{1 - x^2}} \|p\|_{[-1,1]}$, which has quadratically better dependence on the distance from $x$ to $\pm 1$. However, for $x$ bounded away from $\pm 1$ our bound is asymptotically tight and sufficient for many applications in theoretical computer science. Moreover, we can recover the Markov-Bernstein inequality near $\pm 1$ by considering a different linear program (cf. Subsection 5.2).
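Proof of Proposition 14. We write

\[ (Ay)_i = \frac{(-1)^m}{2n} \left( 1 + (-1)^{n+i} \right) + \frac{1}{n} \sum_{j=1}^{n-1} (-1)^{j+m} \cos^{i-2}\left( \frac{j\pi}{n} \right). \tag{18} \]

Our goal is to show that $(Ay)_i = 1$ for $i = 1$, and $(Ay)_i = 0$ for all other $i$. The case where $i$ is even is easy. Since $\cos(\pi - \theta) = -\cos\theta$, the terms in the sum naturally pair up. Specifically,

\[ (-1)^{j+m} \cos^{i-2}\left( \frac{j\pi}{n} \right) + (-1)^{(n-j)+m} \cos^{i-2}\left( \frac{(n-j)\pi}{n} \right) = 0, \]

so the sum in Eq. (18) is clearly zero; the boundary term also vanishes because $n + i$ is odd.

Now suppose $i$ is odd and larger than 1. Then Lemma 26 in Appendix C implies that $(Ay)_i = 0$. All that remains is the case of $i = 1$. We write the sum explicitly as

\[ (Ay)_1 = \frac{1}{n} \sum_{j=0}^{n-1} (-1)^{j+m} \sec\left( \frac{j\pi}{n} \right). \]

By Lemma 28 in Appendix C, this evaluates to 1.

□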



5.2 Proving the Markov-Bernstein Inequality at x = 1 

A similar strategy allows us to bound the derivative of a degree-$n$ polynomial $p$ at the point $x = 1$. We can expand $p(x)$ around 1 as $p(x) = c_n(x - 1)^n + c_{n-1}(x - 1)^{n-1} + \cdots + c_1(x - 1) + c_0$. Then $p'(1) = c_1$. A modest update to Primal captures the problem of maximizing $p'(1)$ subject to boundedness constraints at the Chebyshev nodes.

\[ \begin{array}{ll}
\max & c_1 \\
\text{such that} & \sum_{i=0}^n c_i (x - 1)^i \le 1, \quad \forall x = \cos(j\pi/n),\ 0 \le j \le n,\ j \text{ even} \\
 & -\sum_{i=0}^n c_i (x - 1)^i \le 1, \quad \forall x = \cos(j\pi/n),\ 0 \le j \le n,\ j \text{ odd}
\end{array} \]

The dual linear program takes the form

\[ \begin{array}{ll}
\min & \sum_{j=0}^n y_j \\
\text{such that} & By = e_1 \\
 & y_j \ge 0 \quad \forall j \in \{0, \ldots, n\}
\end{array} \]

where $B_{0,0} = 1$ and $B_{ij} = (-1)^j (\cos(j\pi/n) - 1)^i$ otherwise. The determinant of $B$ is, up to sign, a Vandermonde determinant, and in particular is nonzero. Thus, $By = e_1$ has a unique solution. Again, we can write down this solution explicitly.

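Proposition 17. Let $n$ be a natural number, and define the $(n+1) \times (n+1)$ matrix $B$ as above. Then

\[ y = \left( \frac{2n^2 + 1}{6},\ \csc^2\left( \frac{\pi}{2n} \right),\ \csc^2\left( \frac{\pi}{n} \right),\ \ldots,\ \csc^2\left( \frac{(n-1)\pi}{2n} \right),\ \frac{1}{2} \right)^T \]

is the unique solution to $By = e_1$.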

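Proof. We just need to show that $By = e_1$. First, if $i \neq 0$ then

\[ (By)_i = \frac{1}{2} (-1)^n (-2)^i + \sum_{j=1}^{n-1} (-1)^j \csc^2\left( \frac{j\pi}{2n} \right) \left( \cos\left( \frac{j\pi}{n} \right) - 1 \right)^i. \]

Using the half-angle identity $\sin^2(\theta/2) = (1 - \cos\theta)/2$, this becomes

\[ (By)_i = (-1)^{i+n} 2^{i-1} + (-1)^i 2^i \sum_{j=1}^{n-1} (-1)^j \sin^{2i-2}\left( \frac{j\pi}{2n} \right). \]

If $i = 1$, then the sine terms are identically 1, so $(By)_1$ evaluates to 1. If $i > 1$, then by Lemma 27 in Appendix C the sum of sine terms evaluates to $\frac{1}{2}(-1)^n - (-1)^n = -\frac{1}{2}(-1)^n$. Therefore, $(By)_i = 0$ for all $i > 1$.

Finally, we need to show that $(By)_0 = 0$. We expand

\[ (By)_0 = \frac{2n^2 + 1}{6} + \sum_{j=1}^{n-1} (-1)^j \csc^2\left( \frac{j\pi}{2n} \right) + \frac{(-1)^n}{2}. \]

By Lemma 31, this evaluates to 0. □

Corollary 18. If $p$ is a polynomial of degree $n$, then $p'(1) \le n^2 \|p\|_{[-1,1]}$.

Proof. Let $y$ be as in Proposition 17. Notice that $y_j \ge 0$ for all $j \in \{0, \ldots, n\}$. Combined with Proposition 17, it is clear that $y$ is dual-feasible. By Lemma 30,

\[ \sum_{j=0}^n y_j = \frac{2n^2 + 1}{6} + \frac{1}{2} + \sum_{j=1}^{n-1} \csc^2\left( \frac{j\pi}{2n} \right) = \frac{2n^2 + 1}{6} + \frac{1}{2} + \frac{4n^2 - 4}{6} = n^2. \]

□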
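The same kind of floating-point check applies to Proposition 17 and Corollary 18. The sketch below, with illustrative function names that are not from the paper, forms $B$ with the convention $B_{0,0} = 1$, applies it to the claimed $y$, and confirms $By \approx e_1$ and $\sum_j y_j \approx n^2$ for small $n$.

```python
import math

def dual_solution_at_one(n):
    """Candidate dual vector y from Proposition 17."""
    y = [0.0] * (n + 1)
    y[0] = (2 * n * n + 1) / 6.0
    for j in range(1, n):
        y[j] = 1.0 / math.sin(j * math.pi / (2 * n)) ** 2  # csc^2(j*pi/(2n))
    y[n] = 0.5
    return y

def apply_B(n, y):
    """Return B y, where B[i][j] = (-1)^j * (cos(j*pi/n) - 1)^i and B[0][0] = 1."""
    out = []
    for i in range(n + 1):
        total = 0.0
        for j in range(n + 1):
            alpha = math.cos(j * math.pi / n) - 1.0
            power = 1.0 if i == 0 else alpha ** i  # row 0 is all ones (0^0 := 1)
            total += (-1) ** j * power * y[j]
        out.append(total)
    return out

for n in range(1, 9):
    y = dual_solution_at_one(n)
    By = apply_B(n, y)
    e1 = [1.0 if i == 1 else 0.0 for i in range(n + 1)]
    assert all(abs(a - b) < 1e-8 for a, b in zip(By, e1))  # B y = e_1
    assert abs(sum(y) - n * n) < 1e-8                       # dual objective is n^2
```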

By combining Corollary 18 with a shifting and scaling argument similar to the one used to prove Corollary 16, we recover an asymptotic statement of Markov's inequality for the first derivative of a constrained polynomial.

Corollary 19. If $p$ is a polynomial of degree $n$, then for all $x_0 \in [-1, 1]$ with $x_0 \neq 0$, $|p'(x_0)| \le \frac{n^2}{|x_0|} \|p\|_{[-1,1]}$. Thus, for any constant $\epsilon \in (0, 1)$, $\|p'\|_{[-1,-\epsilon] \cup [\epsilon,1]} = O(n^2) \|p\|_{[-1,1]}$.

Proof. The argument is the same as in the proof of Corollary 16, except we instead use the auxiliary polynomial $q(x) = p(|x_0| x)$. □

Combining this with Corollary 16, we recover an asymptotically tight version of Markov's inequality for the whole interval $[-1, 1]$.

Corollary 20. If $p$ is a polynomial of degree $n$, then for all $x \in [-1, 1]$, $|p'(x)| \le O(n^2) \|p\|_{[-1,1]}$.



5.3 Markov's Inequality for Higher Derivatives 

In 1892, V. Markov proved the following generalization of the Markov-Bernstein inequality to higher derivatives. Let $p$ be a real polynomial of degree at most $n$, and let $T_n$ be the $n$th Chebyshev polynomial of the first kind. Then

\[ p^{(k)}(x) \le T_n^{(k)}(1) \|p\|_{[-1,1]} \]

for every $x \in [-1, 1]$. We use complementary slackness to prove an important special case of this inequality, namely that $p^{(k)}(1) \le T_n^{(k)}(1) \|p\|_{[-1,1]}$.

While A. A. Markov's inequality for the first derivative has a short proof (see [11] for a proof using tools from approximation theory), the generalization to higher derivatives is considered a deep theorem [29]. The shortest known proof of this theorem proceeds in two steps [29, Section 3.1]. In the first step, it is shown that among all points $x \in [-1, 1]$, the quantity $\sup_{p \in B} |p^{(k)}(x)|$ is maximized at $x = 1$, where again $B$ is the set of degree $n$ polynomials $p$ with real coefficients such that $\|p\|_{[-1,1]} \le 1$. In the second step, it is shown that $p^{(k)}(1) \le T_n^{(k)}(1) \|p\|_{[-1,1]}$. It is this second step that we prove here using complementary slackness.

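The following lemma, found in [15], relates the determinant of a Vandermonde matrix that skips a row to the determinant of an ordinary Vandermonde matrix. For each integer $0 \le k \le n$, we define the elementary symmetric polynomial

\[ e_k(x_1, \ldots, x_n) = \sum_{1 \le j_1 < j_2 < \cdots < j_k \le n} x_{j_1} x_{j_2} \cdots x_{j_k}. \]

Lemma 21. Let $0 \le k \le n$. Then

\[ \det \begin{pmatrix}
1 & 1 & \cdots & 1 \\
x_1 & x_2 & \cdots & x_n \\
\vdots & \vdots & & \vdots \\
x_1^{k-1} & x_2^{k-1} & \cdots & x_n^{k-1} \\
x_1^{k+1} & x_2^{k+1} & \cdots & x_n^{k+1} \\
\vdots & \vdots & & \vdots \\
x_1^{n} & x_2^{n} & \cdots & x_n^{n}
\end{pmatrix}
= e_{n-k}(x_1, \ldots, x_n) \cdot \det \begin{pmatrix}
1 & 1 & \cdots & 1 \\
x_1 & x_2 & \cdots & x_n \\
\vdots & \vdots & & \vdots \\
x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1}
\end{pmatrix}. \]

Proposition 22. Let $p$ be a polynomial of degree $n$ with $|p(x)| \le 1$ for $x \in [-1, 1]$. Then

\[ p^{(k)}(1) \le T_n^{(k)}(1), \]

where $T_n(x)$ is the $n$-th Chebyshev polynomial of the first kind.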

Proof. This is obvious if $k = 0$, since $T_n(1) = 1$, so we assume $k > 0$. Recall the expansion $p(x) = c_n(x - 1)^n + c_{n-1}(x - 1)^{n-1} + \cdots + c_1(x - 1) + c_0$. Then the $k$-th derivative of $p$ at 1 is simply $k!\,c_k$. We consider the linear program

\[ \begin{array}{ll}
\max & k!\, c_k \\
\text{such that} & \sum_{i=0}^n c_i (x - 1)^i \le 1, \quad \forall x = \cos(j\pi/n),\ 0 \le j \le n,\ j \text{ even} \\
 & -\sum_{i=0}^n c_i (x - 1)^i \le 1, \quad \forall x = \cos(j\pi/n),\ 0 \le j \le n,\ j \text{ odd}
\end{array} \]

and its dual

\[ \begin{array}{ll}
\min & \sum_{j=0}^n y_j \\
\text{such that} & By = k!\, e_k \\
 & y_j \ge 0 \quad \forall j \in \{0, \ldots, n\}
\end{array} \]

where $B_{0,0} = 1$ and $B_{ij} = (-1)^j (\cos(j\pi/n) - 1)^i$ otherwise. Notice that all primal constraints are tight for the primal solution corresponding to $T_n$, the degree $n$ Chebyshev polynomial of the first kind.

The determinant of $B$ is, up to sign, a Vandermonde determinant, and in particular is nonzero. Thus, $By = k!\, e_k$ has a unique solution. If we can show that this solution has positive entries, complementary slackness (cf. [27, pg. 95]) implies that $T_n$ is a primal optimal solution, and the result will follow.

We now use Cramer's rule to investigate the solution to $By = k!\, e_k$. Recall that Cramer's rule tells us that entry $y_j$ is given by $\det B_j / \det B$, where the matrix $B_j$ is obtained from $B$ by replacing its $j$th column with $k!\, e_k$. Using the formula for the Vandermonde determinant, $\det B$ is given by

\[ (-1)^{\lfloor (n+1)/2 \rfloor} \prod_{0 \le j < j' \le n} \left( \cos\left( \frac{j'\pi}{n} \right) - \cos\left( \frac{j\pi}{n} \right) \right). \]

Since $\cos(x)$ is a decreasing function on the interval $[0, \pi]$, all the terms in the product are negative. So the sign of $\det B$ is $(-1)^{\lfloor (n+1)/2 \rfloor + \binom{n+1}{2}}$.
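For convenience, let $\alpha_j = \cos(j\pi/n) - 1$. Consider the numerator of Cramer's rule for entry $y_j$. This is the determinant of the matrix $B_j$:

\[ B_j = \begin{pmatrix}
1 & -1 & \cdots & (-1)^{j-1} & 0 & (-1)^{j+1} & \cdots & (-1)^n \\
0 & -\alpha_1 & \cdots & (-1)^{j-1}\alpha_{j-1} & 0 & (-1)^{j+1}\alpha_{j+1} & \cdots & (-1)^n(-2) \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & -\alpha_1^k & \cdots & (-1)^{j-1}\alpha_{j-1}^k & k! & (-1)^{j+1}\alpha_{j+1}^k & \cdots & (-1)^n(-2)^k \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & -\alpha_1^n & \cdots & (-1)^{j-1}\alpha_{j-1}^n & 0 & (-1)^{j+1}\alpha_{j+1}^n & \cdots & (-1)^n(-2)^n
\end{pmatrix}. \]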

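Taking the cofactor expansion along the replaced column, and factoring out $-1$ from each of the appropriate columns gives

\[ \det B_j = (-1)^{\lfloor (n+1)/2 \rfloor + k}\, k!\, \det \begin{pmatrix}
1 & \cdots & 1 & 1 & \cdots & 1 \\
0 & \cdots & \alpha_{j-1} & \alpha_{j+1} & \cdots & -2 \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & \alpha_{j-1}^{k-1} & \alpha_{j+1}^{k-1} & \cdots & (-2)^{k-1} \\
0 & \cdots & \alpha_{j-1}^{k+1} & \alpha_{j+1}^{k+1} & \cdots & (-2)^{k+1} \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & \alpha_{j-1}^{n} & \alpha_{j+1}^{n} & \cdots & (-2)^{n}
\end{pmatrix}. \]

The matrix satisfies the conditions of Lemma 21, so we can write this as

\[ (-1)^{\lfloor (n+1)/2 \rfloor + k}\, k!\; e_{n-k}(\alpha_0, \ldots, \alpha_{j-1}, \alpha_{j+1}, \ldots, \alpha_n) \prod_{\substack{0 \le i < i' \le n \\ i,\, i' \neq j}} (\alpha_{i'} - \alpha_i). \]

There are $\binom{n}{2}$ strictly negative terms in the product, and as long as $k > 0$, $e_{n-k}$ has sign $(-1)^{n-k}$. So the sign of the whole product is $(-1)^{\lfloor (n+1)/2 \rfloor + n + \binom{n}{2}}$. Dividing by the sign of $\det B$, we get $\mathrm{sgn}(y_j) = 1$. □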
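The positivity claim in the proof of Proposition 22 can be probed numerically for small parameters. The sketch below solves $By = k!\,e_k$ with a plain Gaussian elimination routine (a generic solver chosen for self-containment, not the Cramer's rule argument used in the proof) and checks that every $y_j$ is positive and that $\sum_j y_j$ matches $T_n^{(k)}(1)$. The closed form $T_n^{(k)}(1) = \prod_{i=0}^{k-1} (n^2 - i^2)/(2i+1)$ is a standard fact about Chebyshev polynomials that the paper does not state; all function names are hypothetical.

```python
import math

def solve(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def dual_for_kth_derivative(n, k):
    """Solve B y = k! e_k with B[i][j] = (-1)^j (cos(j*pi/n) - 1)^i, B[0][0] = 1."""
    B = [[(-1) ** j * (1.0 if i == 0 else (math.cos(j * math.pi / n) - 1.0) ** i)
          for j in range(n + 1)] for i in range(n + 1)]
    rhs = [float(math.factorial(k)) if i == k else 0.0 for i in range(n + 1)]
    return solve(B, rhs)

def chebyshev_kth_derivative_at_one(n, k):
    """T_n^{(k)}(1) via the product formula prod_{i<k} (n^2 - i^2) / (2i + 1)."""
    val = 1.0
    for i in range(k):
        val *= (n * n - i * i) / (2 * i + 1)
    return val

for n in range(1, 7):
    for k in range(1, n + 1):
        y = dual_for_kth_derivative(n, k)
        target = chebyshev_kth_derivative_at_one(n, k)
        assert all(v > 0 for v in y)                      # dual feasibility
        assert abs(sum(y) - target) < 1e-6 * max(1.0, target)
```

The elimination is only reliable for small $n$, since Vandermonde-type systems become ill-conditioned quickly.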



6 Conclusion 

The approximate degree is a fundamental measure of the complexity of a Boolean function, with pervasive 
applications throughout theoretical computer science. We have sought to advance our understanding of this 
complexity measure by resolving the approximate degree of the AND-OR tree, and reproving known lower 
bounds through the construction of explicit dual witnesses. Nonetheless, few general results on approximate 
degree are known, and our understanding of the approximate degree of fundamental classes of functions 
remains incomplete. For example, the approximate degree of $AC^0$ remains open [216], as does the approximate
degree of approximate majority (see [MJ Page 

Resolving these open questions may require moving beyond traditional symmetrization-based arguments, 
which transform a polynomial $p$ on $n$ variables into a polynomial $q$ on $m < n$ variables in such a way that $\deg(q) \le \deg(p)$, before obtaining a lower bound on $\deg(q)$. Symmetrization necessarily "throws away"
information about p; in contrast, the method of constructing dual polynomials appears to be a very powerful 
and complete way of reasoning about approximate degree. Can progress be made on these open problems 
by directly constructing good dual polynomials? 

Acknowledgements. We are grateful to Ryan O'Donnell and Li- Yang Tan for posing the problem of proving 
Markov-type inequalities via the construction of a dual witness, and to Karthekeyan Chandrasekaran, Jon 
Ullman, and Andrew Wan for valuable feedback on an early version of this manuscript. 

^This open problem is due to Srikanth Srinivasan. 
A Final Details of Thm. H

A.l Proof of Proposition [5] 

Let $r = \lfloor 1/\alpha \rfloor$. Then

\[ \Pr_y[F(z) \neq F(z_1 y_1, \ldots, z_m y_m)] \le \Pr_y[F(z) \neq F(z_1 w_1, \ldots, z_m w_m) \text{ for some } w \preceq y], \tag{19} \]

where $w \preceq y$ if $\{i : w_i = -1\} \subseteq \{i : y_i = -1\}$. By monotonicity, it suffices to bound the right hand side under the assumption that each bit of $y$ takes the value $-1$ independently with probability exactly $1/r$.

Consider a matrix $Y \in \{-1, 1\}^{r \times m}$ where each column is chosen independently at random from the $r$ vectors having a $-1$ in one slot and a $+1$ in all the others. Let $y^1, y^2, \ldots, y^r$ denote the rows of $Y$. While these rows are not independent, each is individually a random string whose $i$th bit independently takes the value $-1$ with probability $1/r$. Thus the right-hand side of Equation (19) equals

\[ \frac{1}{r} \sum_{j=1}^{r} \Pr_Y[F(z) \neq F(z_1 w_1, \ldots, z_m w_m) \text{ for some } w \preceq y^j] = \frac{1}{r} \mathbf{E}_Y\left[ \#\{ j : F(z) \neq F(z_1 w_1, \ldots, z_m w_m) \text{ for some } w \preceq y^j \} \right]. \]

The latter count has at most $\mathrm{bs}_z(F)$ nonzero terms because $y^1, \ldots, y^r$ are the characteristic vectors of disjoint sets. The asserted inequality follows because $1/r = 1/\lfloor 1/\alpha \rfloor \le 2\alpha$. □



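A.2 Proof of Equation (15)

We prove that the polynomial $C$ defined earlier satisfies Eq. (15), reproduced here for convenience.

\[ \sum_{(x_1, \ldots, x_M) \in (\{-1,1\}^N)^M} C(x_1, \ldots, x_M)\, \chi_S(x_1, \ldots, x_M) = 0 \quad \text{for each } |S| \le d \cdot d'. \tag{15} \]

To prove Eq. (15), notice that since $\Psi$ is orthogonal on $\{-1, 1\}^M$ to all polynomials of degree at most $d$, we have the Fourier representation

\[ \Psi(z) = \sum_{\substack{T \subseteq \{1, \ldots, M\} \\ |T| > d}} \hat{\Psi}(T)\, \chi_T(z) \]

for some reals $\hat{\Psi}(T)$. We can thus write

\[ C(x_1, \ldots, x_M) = 2^M \sum_{|T| > d} \hat{\Psi}(T) \prod_{i \in T} \psi(x_i) \prod_{i \notin T} |\psi(x_i)|. \]

Given a subset $S \subseteq \{1, \ldots, M\} \times \{1, \ldots, N\}$ with $|S| \le d \cdot d'$, partition $S = (\{1\} \times S_1) \cup \cdots \cup (\{M\} \times S_M)$ where each $S_i \subseteq \{1, \ldots, N\}$. Then

\[ \sum_{(x_1, \ldots, x_M) \in (\{-1,1\}^N)^M} C(x_1, \ldots, x_M)\, \chi_S(x_1, \ldots, x_M) = 2^M \sum_{|T| > d} \hat{\Psi}(T) \prod_{i \in T} \underbrace{\left( \sum_{x_i} \psi(x_i) \chi_{S_i}(x_i) \right)} \prod_{i \notin T} \left( \sum_{x_i} |\psi(x_i)| \chi_{S_i}(x_i) \right). \]

Since $|S| \le d \cdot d'$, by the pigeonhole principle, $|S_i| \le d'$ for at least $M - d$ indices $i \in \{1, \ldots, M\}$. Thus for each set $T$, at least one of the underbraced factors is zero, as $\chi_{S_i}$ is orthogonal to $\psi$ whenever $|S_i| \le d'$.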



B Dual Polynomials of Symmetric Functions 

Proof of Lemma 11. Fix an $\ell$ such that $\ell \in [\lfloor ct \rfloor]$ and let $i(k) = tk^2 + 4\ell$. It is enough to show that $\prod_{i' \in S,\ i' \neq i(k)} |i(k) - i'|$ is minimized at $k = 1$. We can expand this product as



ni 



lct\ 

n 

m=0 



LV("-*+i)/*J 



(tfc2 +4£- (t- 4m)) Jl \tk^ + U - {tf + im)\ x J| 



V 



lct\ 



Ami 



m=0 



Cancelling the factor independent of k and considering each index m separately, we just need to show that 
for any fixed < £,m < ct, the product 

{tk^ +U-(t~ 4m)) Y[ \tk'^ +4,£- {tf + 4m) | 
as a function of fc > 1 is minimized at fc = 1. Divide each factor by t to obtain 



k'-l 



t 



t{k^-p) 



(20) 



We first obtain a lower bound for this expression when $k \ge 2$. Consider the following two facts.

Fact 23. Let $k \le m$ be nonnegative integers. Then

\[ \prod_{j \in [m],\ j \neq k} |k^2 - j^2| \ge \prod_{j \in [m],\ j \neq 1} |1 - j^2|. \tag{21} \]

In other words, this product of differences of squares is minimized at $k = 1$.

Proof of Fact 23. This is clear if $k = 0$, so suppose $k \ge 2$. Then the left-hand side of Eq. (21) can be written as

\[ \prod_{j \in [m],\ j \neq k} |k - j| \prod_{j \in [m],\ j \neq k} (k + j) = k!\,(m - k)! \cdot \frac{(m + k)!}{2k\,(k - 1)!} = \frac{(m + k)!\,(m - k)!}{2}. \]

Taking the ratio of the left-hand side of Eq. (21) to the right gives us

\[ \frac{(m + k)!\,(m - k)!}{(m + 1)!\,(m - 1)!} = \frac{(m + k)(m + k - 1) \cdots (m + 2)}{(m - 1)(m - 2) \cdots (m - k + 1)}, \]

which is a product of numbers that are all at least 1. □
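Fact 23 is easy to test exhaustively for small parameters in exact integer arithmetic. The sketch below assumes, as the closed form in the proof suggests, that $[m]$ denotes $\{0, 1, \ldots, m\}$; the function name is illustrative.

```python
from math import factorial, prod

def product_of_square_differences(k, m):
    """Product over j in {0, ..., m}, j != k, of |k^2 - j^2| (exact integers)."""
    return prod(abs(k * k - j * j) for j in range(m + 1) if j != k)

for m in range(1, 13):
    at_one = product_of_square_differences(1, m)
    for k in range(0, m + 1):
        value = product_of_square_differences(k, m)
        assert value >= at_one  # Fact 23: the minimum is attained at k = 1
        if k >= 1:
            # closed form used in the proof: (m + k)! (m - k)! / 2
            assert 2 * value == factorial(m + k) * factorial(m - k)
```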

Fact 24. Let $k$ be a nonnegative integer. Then

\[ \sum_{j \ge 0,\ j \neq k} \frac{1}{|j^2 - k^2|} \le \frac{\pi^2}{3}. \]

Proof of Fact 24. First suppose $j > k$. Then

\[ j^2 - k^2 = (j - k)^2 + 2jk - 2k^2 \ge (j - k)^2. \]

Thus

\[ \sum_{j > k} \frac{1}{j^2 - k^2} \le \sum_{j > k} \frac{1}{(j - k)^2} \le \frac{\pi^2}{6}. \]

A similar argument holds for $j < k$. □
Combining the two facts, Expression (20) is at least



whenever $k \ge 2$. On the other hand, setting $k = 1$ in Expression (20) gives us at most



-A 



-A 



<8ce.p(^^)l[\l^f 



which is easily verified to be smaller than our lower bound for the $k \ge 2$ case if $c < 1/32$. □
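Proof of Lemma 12. Write

\[ |P(r)| = \frac{\pi_S(i^*) \prod_{j \in [n] \setminus \{r\}} |r - j|}{n!\; |r - (i^* - 1)|\, |r - (i^* + 1)| \prod_{j \in S \setminus \{r\}} |r - j|} \le \frac{1}{n!} \cdot \frac{r!\,(n - r)!}{|r - (i^* - 1)|\, |r - (i^* + 1)|} \quad \text{by definition of } i^* \]

\[ = \binom{n}{r}^{-1} \frac{1}{|r - (i^* - 1)|\, |r - (i^* + 1)|}. \]

The bound follows since $c < 1/32$ and $t \ge 2$. The calculation for $v$ is similar. □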

Proof of Lemma 13. Using the bounds from the previous lemma, as well as the facts that $c < 1/32$ and $t \ge 2$, the left hand side is at most

El x - X ^ 1 x - 1 Ct + 1 X - 1 

16P-1 ^'^^i2(fc2_2)2 - ^ 15£2 + 2^ (fc2 _ 2)2 

e^O k>2 l=a ^ ' (=1 k=2 ^ ' 

- 45 ^ 64 ^ fc4 

fc=2 

~ 45 16 1^90 
2 

< -. 

- 5 

□ 
Proof of Proposition 9. By symmetry, we can assume that $t \le n/2$. Moreover, we may assume that $t$ is the largest such integer with $F(t - 1) \neq F(t)$. We have already handled a few special cases: the case of $t = 1$ corresponds to Spalek's construction for the OR function [30], and the case of $t = \Omega(n)$ follows from Proposition 10. We can therefore assume that $2 \le t \le n/4$. We now consider three separate cases based on which of the terms $\binom{n}{i^*-1}|P(i^* - 1)|$, $1$, $\binom{n}{i^*+1}|P(i^* + 1)|$ is the smallest. In all of these cases, we will show that we can construct a polynomial $Q$ such that $(Q \cdot F)/\|Q\|_1 \ge 1/14$.

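Case 1: $\binom{n}{i^*-1}|P(i^* - 1)|,\ 1 \ge \binom{n}{i^*+1}|P(i^* + 1)|$.

Recall that by translating $S$ by at most $4ct$ (thereby keeping it a subset of $[n]$), we can assume that $i^* = t$. Let $Q(i) = (-1)^i P(i)$. Then the multilinear polynomial associated to $Q$ has pure high degree $|T| = \Omega(\sqrt{t(n - t + 1)})$. The $\ell_1$ norm of $Q$ is

\[ \|Q\|_1 = \sum_{i \in T} \binom{n}{i} |Q(i)| \le \binom{n}{t-1}|Q(t - 1)| + 1 + \binom{n}{t+1}|Q(t + 1)| + \frac{2}{5} \quad \text{(Lemma 13)} \]

\[ \le 2\binom{n}{t-1}|Q(t - 1)| + \frac{7}{5}. \]

Choose the sign bit $s$ in the definition of $P$ so that $Q(t) = F(t)$. Since $P(t - 1)$ has the same sign as $P(t)$, it holds that $\mathrm{sgn}(Q(t - 1)) = \mathrm{sgn}(F(t - 1))$. Therefore,

\[ Q \cdot F = \binom{n}{t-1}|Q(t - 1)| + 1 + \sum_{i \in T \setminus \{t-1,\, t\}} \binom{n}{i} Q(i) F(i) \]

\[ \ge \binom{n}{t-1}|Q(t - 1)| + 1 - \binom{n}{t+1}|Q(t + 1)| - \frac{2}{5} \]

\[ \ge \frac{1}{2}\binom{n}{t-1}|Q(t - 1)| + \frac{1}{2} - \frac{2}{5}. \]

Using the fact that $(A + B)/(C + D) \ge \min(A/C, B/D)$ for positive $A, B, C, D$,

\[ \frac{Q \cdot F}{\|Q\|_1} \ge \frac{1}{14}. \]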

Case 2: $1,\ \binom{n}{i^*+1}|P(i^* + 1)| \ge \binom{n}{i^*-1}|P(i^* - 1)|$.

This time, translate $S$ so that $i^* = t - 1$. We remark that under this translation we still have $T \subseteq [n]$, since we assumed $t \ge 2$. The remainder of the analysis is identical to Case 1, interchanging the roles of $t - 1$ with $t + 1$.

Case 3: $\binom{n}{i^*-1}|P(i^* - 1)|,\ \binom{n}{i^*+1}|P(i^* + 1)| \ge 1$.

Translate $S$ so that $i^* = t + 1$, and choose $s$ so that $Q(t) = (-1)^{i^*-1} P(i^* - 1) = F(t)$. Observe that $F(t + 2) = F(t)$, since we chose $t \le n/4$ to be the largest such integer with $F(t - 1) \neq F(t)$. Then $Q(t + 2) = (-1)^{i^*+1} P(i^* + 1)$ has the same sign as $F(t + 2)$. The $\ell_1$ norm calculation follows as in Case 1 to give

l|g||i< (^)lQWI + (J2)'^^^ + ^^' + I-K^)'^^*^''^K^ + 2)'^^* + ^^'■ 






The correlation with F is 



Q-F= (M|Q(t)|+ (^^ J|Q(t + 2)|+ J2 ( J^WQW 



2\tJ' " 2 Vt + 2, 
so (g-F)/i|Q|li>l/4. 

C Index of Trigonometric Identities 

Lemma 25 ([H No. 429]). 



^ VJ y 2 ^ ^ 2cos(6i/2) 



Lemma 26. Let $i < n$ be odd natural numbers. Then

\[ \sum_{j=0}^{n-1} (-1)^j \cos^i\left( \frac{j\pi}{n} \right) = 0. \]

Proof. For odd $i$, consider the well-known power reduction formula

\[ \cos^i \theta = 2^{1-i} \sum_{k=0}^{(i-1)/2} \binom{i}{k} \cos((i - 2k)\theta). \]

Letting 9 = ir/n and applying the previous lemma, 



j=0 ^ ^ A:=0 ^ ' i=0 



^^\ n ^ (_^)„cos((z - 2fc)V2n + - 2fc)7r) 



(i-l)/2 



kj \2 ^ ' 2cos((i - 2k)iT/2r 



Lemma 27. Lef 2i < n. Then 

Proof. Consider the power reduction formula 

sxn^\e) = 2-2' ('^^*') + 2^-2' ^(-1)^-^ Q*") cos((2* - 2k)e). 
Let $\theta = \pi/2n$ and apply Lemma [



cos 



n 



2M , o2-2iY^/ ^^^-k('^i\ / I / , „ cos((i - A:)7r/2n + (i - fc)7r) 



(1 + (-ir)2-2^ r + 2^-^' E(-i)^"' ; u + (-1) 



ij ' \kj\2 ' ' 2cos((i - A:)7r/2ri) 



fe=0 

2^ \ ^9_9t X ^ / ^ \ i_ / 2z \ / I 1 



fe=0 



(1 + (-l)")2-2» ('^.*) + 2^-2' ^(-l)^-fc ('2*') + (-l)"2i-2* E 



fe=0 ^ k=0 



Using the identity 



'2i 



and the symmetry of the binomial coefficients, the first sum evaluates to Therefore, 
E(-l)^ sin2^ (^g^ = (1 + i-ir)2-^'(^^^ + i-l)^'+'2-^'(^^^ + (-l)"2i-2^ (^22*-i - i (^^^* 



Lemma 28. Let $n = 2m + 1$ be odd. Then

\[ \sum_{k=0}^{n-1} (-1)^k \sec\left( \frac{k\pi}{n} \right) = (-1)^m n. \]

Proof. This follows from the identity [41] and the observation that $\sec(2k\pi/n) = -\sec((n - 2k)\pi/n)$. □

Lemma 29. Let $n$ be odd. Then

\[ \sum_{k=0}^{n-1} \sec^2\left( \frac{k\pi}{n} \right) = n^2. \]

Proof. We start with the identity [Ml No. 445]

\[ \sum_{k=0}^{n-1} \tan^2\left( \theta + \frac{k\pi}{n} \right) = n^2 \cot^2\left( \frac{n\pi}{2} + n\theta \right) + n(n - 1). \]

Letting $\theta = 0$, this evaluates to $n(n - 1)$ as long as $n$ is odd. Substituting $\tan^2(k\pi/n) = \sec^2(k\pi/n) - 1$ into the left-hand side gives the identity. □
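The two secant identities above lend themselves to a quick floating-point check. The sketch below assumes the summation range $k = 0, \ldots, n-1$ for both lemmas (the range is partly lost in the source) and uses illustrative function names.

```python
import math

def alternating_secant_sum(n):
    """Sum over k = 0..n-1 of (-1)^k * sec(k*pi/n)."""
    return sum((-1) ** k / math.cos(k * math.pi / n) for k in range(n))

def secant_square_sum(n):
    """Sum over k = 0..n-1 of sec^2(k*pi/n)."""
    return sum(1.0 / math.cos(k * math.pi / n) ** 2 for k in range(n))

for n in range(1, 16, 2):  # odd n only, so cos(k*pi/n) never vanishes
    m = (n - 1) // 2
    assert abs(alternating_secant_sum(n) - (-1) ** m * n) < 1e-9  # Lemma 28
    assert abs(secant_square_sum(n) - n * n) < 1e-8               # Lemma 29
```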
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An'' 



Lemma 30 ([H Nos. 439, 440]). 

n-l 

Lemma 31. 



Proof. Consider the identity [H Nos. 441, 442] 



J2 csc'm^- + -{{~ir~l). 

j odd 

Let ^ = 7r/2vi and subtract two copies of the second identity from the identity in . Then we get 

□ 